Solution to Series 3
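Prof. Nicolai Meinshausen, Regression FS 2016

1. a) The general least-squares regression estimator is given as

$$\hat\beta = (X^T X)^{-1} X^T y. \qquad (1)$$

Using the model equation, we get in this case

$$X^T X = \begin{pmatrix} x^{(1)T} x^{(1)} & x^{(1)T} x^{(2)} \\ x^{(2)T} x^{(1)} & x^{(2)T} x^{(2)} \end{pmatrix} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}.$$

Hence we have

$$(X^T X)^{-1} = \frac{1}{1-\rho^2} \begin{pmatrix} 1 & -\rho \\ -\rho & 1 \end{pmatrix}.$$

Plugging this into (1), we get

$$\hat\beta = \frac{1}{1-\rho^2} \begin{pmatrix} x^{(1)T} y - \rho\, x^{(2)T} y \\ x^{(2)T} y - \rho\, x^{(1)T} y \end{pmatrix},$$

i.e.

$$\hat\beta_1 = \frac{1}{1-\rho^2} \left( x^{(1)} - \rho\, x^{(2)} \right)^T y. \qquad (2)$$

b) Plugging the model equation $y = X\beta + \epsilon$ into (2) gives

$$\hat\beta_1 = \frac{1}{1-\rho^2} \left( x^{(1)} - \rho\, x^{(2)} \right)^T (X\beta + \epsilon).$$

The only random part is $\epsilon$, so

$$\begin{aligned}
\mathrm{Var}[\hat\beta_1] &= \mathrm{Var}\left[ \frac{1}{1-\rho^2} \left( x^{(1)} - \rho\, x^{(2)} \right)^T \epsilon \right] \\
&= \frac{1}{(1-\rho^2)^2}\, \mathrm{Var}\left[ \sum_{i=1}^n \left( x_i^{(1)} - \rho\, x_i^{(2)} \right) \epsilon_i \right] \\
&= \frac{1}{(1-\rho^2)^2} \sum_{i=1}^n \left( x_i^{(1)} - \rho\, x_i^{(2)} \right)^2 \mathrm{Var}[\epsilon_i] \quad \text{(since the } \epsilon_i \text{ are independent)} \\
&= \frac{\sigma^2}{(1-\rho^2)^2} \left( x^{(1)T} x^{(1)} - 2\rho\, x^{(1)T} x^{(2)} + \rho^2\, x^{(2)T} x^{(2)} \right) \\
&= \frac{\sigma^2}{(1-\rho^2)^2} \left( 1 - 2\rho^2 + \rho^2 \right) = \frac{\sigma^2}{1-\rho^2}.
\end{aligned}$$

Hence, for $|\rho|$ close to 1 (high correlation between the two predictors), the variance of the least-squares estimator is large.

2. a) In the script, Chapter 1.3.3, we find

$$\hat\sigma^2 = \frac{\sum_{i=1}^n (y_i - \hat y_i)^2}{n - p}.$$

Further on in the script we then find

$$\hat\sigma_\alpha^2 = \hat\sigma^2 \left( (X^T X)^{-1} \right)_{11} \quad \text{and} \quad \hat\sigma_\beta^2 = \hat\sigma^2 \left( (X^T X)^{-1} \right)_{22}.$$

These expressions are evaluated with the code below, which yields $\hat\sigma$, $\hat\sigma_\alpha$ and $\hat\sigma_\beta$.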
## R Code
    library(car)                 # also loads carData, which provides Sahlins
    data(Sahlins)
    str(Sahlins)
    y <- Sahlins$acres
    x <- Sahlins$consumers
    # least-squares estimates (closed form for simple linear regression)
    beta.hat  <- cov(x, y) / var(x)
    alpha.hat <- mean(y) - beta.hat * mean(x)
    yhat  <- alpha.hat + x * beta.hat
    resid <- y - yhat
    # a)
    sigmahat  <- sqrt(sum(resid^2) / 18)     # n - p = 20 - 2 = 18
    xdesign   <- cbind(1, x)                 # design matrix with intercept column
    mat       <- solve(t(xdesign) %*% xdesign)
    sigmalpha <- sigmahat * sqrt(mat[1, 1])
    sigmabeta <- sigmahat * sqrt(mat[2, 2])

b) The confidence intervals are calculated as described in the script. So we need $\hat\sigma_\alpha$, $\hat\sigma_\beta$ and the 97.5% quantile of the $t_{n-p}$ distribution. The latter can be obtained with qua <- qt(0.975, 18). We then get

$$CI_{0.95}(\alpha) = [\hat\alpha - \mathrm{qua} \cdot \hat\sigma_\alpha,\ \hat\alpha + \mathrm{qua} \cdot \hat\sigma_\alpha],$$

whose lower bound is 0.3915, and similarly for $\beta$

$$CI_{0.95}(\beta) = [\hat\beta - \mathrm{qua} \cdot \hat\sigma_\beta,\ \hat\beta + \mathrm{qua} \cdot \hat\sigma_\beta].$$

For the t-statistics of the null hypotheses $\alpha = 0$ and $\beta = 0$ we obtain $\hat\alpha / \hat\sigma_\alpha = 2.9368$ and $\hat\beta / \hat\sigma_\beta = 1.7197$. The distribution function of the t-distribution is obtained with pt(). The two-sided p-values are then calculated as follows:

## R Code
    pvalalpha <- 2 * (1 - pt(2.9368, 18))    # p-value for H0: alpha = 0
    pvalbeta  <- 2 * (1 - pt(1.7197, 18))    # p-value for H0: beta = 0

We now want to check these results with the functions lm() and confint(). We obtain

    > lmobj <- lm(y ~ x)
    > summary(lmobj)

    Call:
    lm(formula = y ~ x)

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                                       **
    x

    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: on 18 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: on 1 and 18 DF, p-value:

    > confint(lmobj, level = 0.95)
                2.5 % 97.5 %
    (Intercept)
    x

c) We cannot conclude that the slope is different from zero: the corresponding p-value (pvalbeta above) exceeds 0.05. The p-value of the intercept (pvalalpha above) is below 0.01, meaning that we can reject the null hypothesis $\alpha = 0$ at the 1% level and conclude that the intercept is different from zero.

3. a) The scatterplot is generated with the code below. The desired countries are also obtained with this code.

[Figure: scatterplot matrix of LifeExp, People.per.TV and People.per.Dr]

## R Code
    data <- read.table("...")    # file location not preserved in this copy
    str(data)
    'data.frame': 40 obs. of 5 variables:
     $ LifeExp       : num
     $ People.per.TV : num
     $ People.per.Dr : int
     $ LifeExp.Male  : int
     $ LifeExp.Female: int
    plot(data[, c(1, 2, 3)])     # we see above that variables 1, 2, 3 are relevant
    countries <- row.names(data) # vector with all countries
    indizele <- order(data[, 1], decreasing = TRUE)
                                 # indices of life expectancy, ordered by magnitude
    countries[indizele[1:3]]     # 3 countries with the highest life expectancy
    "Japan" "Italy" "Spain"

Analogously, all the other desired countries can be obtained (Burma, Ethiopia, Bangladesh and Ethiopia, Tanzania, Zaire).

b) We see that the second column of the data set has missing values. The affected rows can be determined by combining is.na() and which(): which(is.na(data[,2])). They can then be removed with datanew <- data[-c(32,40),]. The remaining R code is given in the following:

    tv <- datanew$People.per.TV
    le <- datanew$LifeExp
    dr <- datanew$People.per.Dr           # shorter names, to simplify
    lmobj <- lm(le ~ log2(tv) + log2(dr)) # gives fit
    summary(lmobj)

    Call:
    lm(formula = le ~ log2(tv) + log2(dr))

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                              < 2e-16 ***
    log2(tv)                                    e-05 ***
    log2(dr)                                         **

    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: on 35 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: 64.6 on 2 and 35 DF, p-value: 1.788e-12

So we obtain 90.62 for the intercept, together with estimates for $\beta_1$ and $\beta_2$, in the model

$$le = \alpha + \beta_1 \log_2(tv) + \beta_2 \log_2(dr) + \epsilon.$$

So doubling the number of people per TV, with the number of people per doctor held fixed, is associated with a decrease in life expectancy of 2.02 years ($\hat\beta_1 \approx -2.02$).

c) No. A correlation is not necessarily a causal dependence. It is possible that the correlation comes from confounding variables. In our case, this means that there may be a variable that influences both of the variables under consideration. Correlation doesn't imply causation! However, we can still predict life expectancy from the number of people per TV: even if there is no causal connection, the relationship can be used for prediction.

4. a) R result:

    > pairs(basisch)

We see that there is a negative linear correlation between h.quad and ph. The ph values are all above 7 and thus indicate basic soils. The influence of l.sar is not clear. There are two outliers.

b) R result:

    > summary(lm(h.quad ~ ph + l.sar, data = basisch))

    Call:
    lm(formula = h.quad ~ ph + l.sar, data = basisch)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                                 e-14 ***
    ph                                          e-11 ***
    l.sar

    Residual standard error: on 120 degrees of freedom
    Multiple R-Squared: , Adjusted R-squared:
    F-statistic: on 2 and 120 degrees of freedom, p-value: 0

c) At the 5% level we cannot reject $H_0: \beta_2 = 0$ (p-value = 0.1153). So the variable l.sar is not useful once ph is in the model.
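The test in c) can be cross-checked on any fitted model: for a single coefficient, the squared t-statistic from summary() equals the F-statistic of the partial F-test that compares the models with and without that predictor, and both give the same p-value. The sketch below illustrates this on simulated data. The data frame `sim`, its columns `y`, `x1`, `x2` and the chosen coefficients are hypothetical stand-ins for illustration, not the `basisch` data.

```r
# Simulated stand-in data (hypothetical; NOT the basisch data set)
set.seed(1)
n   <- 123                      # gives 120 residual degrees of freedom
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 2 - 1.5 * sim$x1 + 0.1 * sim$x2 + rnorm(n)

full    <- lm(y ~ x1 + x2, data = sim)   # model with both predictors
reduced <- lm(y ~ x1, data = sim)        # model without x2

# t-test of H0: beta_2 = 0, read off the coefficient table
coefs <- summary(full)$coefficients
t.x2  <- coefs["x2", "t value"]
p.x2  <- coefs["x2", "Pr(>|t|)"]

# Partial F-test of the same hypothesis (reduced vs. full model)
ftab  <- anova(reduced, full)
F.x2  <- ftab[2, "F"]
pF.x2 <- ftab[2, "Pr(>F)"]

# The two tests agree: t^2 = F and identical p-values
print(c(t.squared = t.x2^2, F = F.x2, p.t = p.x2, p.F = pF.x2))
```

Whether H_0 is rejected at the 5% level is then simply the comparison `p.x2 < 0.05`.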
[Figure: pairs plot of ph, l.sar, height and h.quad]

d) R result (basisch.lm denotes the model fitted in b)):

    > basisch.lm <- lm(h.quad ~ ph + l.sar, data = basisch)
    > new.pt <- data.frame(ph = 8, l.sar = 1)
    > conf.interval <- predict(basisch.lm, new.pt, interval = "confidence", level = 0.95)
         fit lwr upr
    [1,]
    > pred.interval <- predict(basisch.lm, new.pt, interval = "prediction", level = 0.95)
         fit lwr upr
    [1,]

fit = fitted value, lwr = lower bound of the interval, upr = upper bound of the interval.

The prediction interval for the height can be obtained by solving for $y_0$ and taking the square root (see part (f) of the corresponding chapter of the script). For the confidence interval of the height we cannot do that (see part (e) there). A regression with height itself cannot be done since some model assumptions are not fulfilled (hints: Q-Q plot, Tukey-Anscombe plot). In practice, the square roots are sometimes taken anyway, since $E[\sqrt{y_0}]$ and $\sqrt{E[y_0]}$ are equal to first order. Alternatively, the interval is found with simulations.

5. a) R results:

    > wdi.select <- wdi2005[, c(780, 515, 1196, 455)]
    > wdi.select[, 1] <- log(wdi.select[, 1])
    > fit <- lm(wdi.select[, 1] ~ wdi.select[, 2] + wdi.select[, 3] + wdi.select[, 4])
    > summary(fit)

    Call:
    lm(formula = wdi.select[, 1] ~ wdi.select[, 2] + wdi.select[, 3] + wdi.select[, 4])

    Residuals:
    Min 1Q Median 3Q Max
    Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
    (Intercept)     5.002e   e                  < 2e-16 ***
    wdi.select[, 2] e        e                  e-05    ***
    wdi.select[, 3] e        e                          **
    wdi.select[, 4] e        e                  e-08    ***

    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: on 63 degrees of freedom
      (181 observations deleted due to missingness)
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: on 3 and 63 DF, p-value: < 2.2e-16

b) R code and output:

    > confint(fit)
                    2.5 % 97.5 %
    (Intercept)     e     e+00
    wdi.select[, 2] e     e-02
    wdi.select[, 3] e     e-03
    wdi.select[, 4] e     e-05

The confidence interval for the coefficient of Social contributions does not contain zero, so the effect of Social contributions is significant at the 5% level.

c) We can calculate the bootstrap confidence interval using the functions boot() and boot.ci() as follows.

    require(boot)
    # statistic for boot(): refit the model on the resampled rows i
    # and return the third coefficient
    conf.int <- function(u, i) {
      bs <- u[i, ]
      fit <- lm(bs[, 1] ~ bs[, 2] + bs[, 3] + bs[, 4])
      fit$coefficients[3]
    }
    set.seed(2)
    bs <- boot(data = wdi.select, statistic = conf.int, R = 1000)
    boot.ci(bs, type = c("basic"), replace = T)

R output:

    BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
    Based on 1000 bootstrap replicates

    CALL :
    boot.ci(boot.out = bs, type = c("basic"), replace = T)

    Intervals :
    Level      Basic
    95%   ( , )
    Calculations and Intervals on Original Scale
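To see what a basic bootstrap interval is made of, the following sketch bootstraps a regression coefficient by hand, using base R only. It resamples rows with replacement, refits the model each time, and reflects the bootstrap quantiles around the original estimate: $[2\hat\theta - q_{0.975},\ 2\hat\theta - q_{0.025}]$. The built-in `mtcars` data and the model `mpg ~ wt + hp` are assumptions chosen purely for illustration (the `wdi2005` data are not used), and the names `coef.star` and `basic.ci` are hypothetical.

```r
# Hand-rolled basic bootstrap interval for one regression coefficient.
# Stand-in data: built-in mtcars (an assumption; NOT the wdi2005 data).
set.seed(2)
R   <- 1000
dat <- mtcars[, c("mpg", "wt", "hp")]

# Estimate on the original sample
theta.hat <- unname(coef(lm(mpg ~ wt + hp, data = dat))["hp"])

# Resample rows with replacement, refit, keep the coefficient of hp
coef.star <- replicate(R, {
  idx <- sample(nrow(dat), replace = TRUE)
  unname(coef(lm(mpg ~ wt + hp, data = dat[idx, ]))["hp"])
})

# Basic interval: reflect the bootstrap quantiles around theta.hat
q <- unname(quantile(coef.star, c(0.025, 0.975)))
basic.ci <- c(lower = 2 * theta.hat - q[2], upper = 2 * theta.hat - q[1])
print(basic.ci)
```

boot.ci() computes its quantiles with a slightly different interpolation rule, so its basic interval need not match this sketch to the last digit even for the same resamples.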