Solution to Series 11

Prof. Dr. M. Maathuis
Multivariate Statistics
SS 2014

1. a)

> car <- read.table("...", sep=";", header=TRUE, na.strings="")

b) As the first random vector X has only two components, we can obtain only two pairs of canonical variables.

c)

> X <- car[,c(6,5)]
> Y <- car[,c(3,4,7,8,9,10)]
> dat <- cbind(X,Y)
> R <- cor(dat)
> R11 <- R[1:2, 1:2]
> R22 <- R[3:8, 3:8]
> R12 <- R[1:2, 3:8]
> R21 <- R[3:8, 1:2]
> R11.inv <- solve(R11)
> R22.inv <- solve(R22)
> # compute E1 and E2
> E1 <- R11.inv %*% R12 %*% R22.inv %*% R21
> E2 <- R22.inv %*% R21 %*% R11.inv %*% R12
> # compute eigenvalues and eigenvectors of E1 and E2:
> eigen(E1)
[output not preserved]
> eigen(E2)
[output not preserved; the eigenvalues beyond the first two are numerically zero, e.g. of order 1e-17]
> # note nonzero eigenvalues are indeed the same and positive
>
> # compute the canonical correlation vectors:
> a1 <- eigen(E1)$vectors[,1]
> a2 <- eigen(E1)$vectors[,2]
> b1 <- eigen(E2)$vectors[,1]
> b2 <- eigen(E2)$vectors[,2]
> # correct the scaling of the canonical correlation vectors:
> round((a1 <- -1 * a1 / sqrt(t(a1) %*% R11 %*% a1)),2)
[output not preserved]
> round((a2 <- -1 * a2 / sqrt(t(a2) %*% R11 %*% a2)),2)
[output not preserved]
> round((b1 <- -1 * b1 / sqrt(t(b1) %*% R22 %*% b1)),2)
[output not preserved]
> round((b2 <- -1 * b2 / sqrt(t(b2) %*% R22 %*% b2)),2)
[output not preserved]

The first pair of canonical vectors thus is

a1 = ( 0.37, 0.67)
b1 = ( 0.38, +0.17, 0.00, 0.47, 0.24, 0.24)

and the second one

a2 = (1.65, 1.55)
b2 = (0.40, 0.57, 0.05, 0.00, 0.32, 0.61)

d) The canonical variables are computed by

u1 = 0.37 Price [...] Value
v1 = 0.38 Economy [...] Service [...] Design [...] Sport [...] Safety [...] Easy.h.

and

u2 = 1.65 Price [...] Value
v2 = 0.40 Economy [...] Service 0.05 Design [...] Sport 0.32 Safety [...] Easy.h.

e)

> # canonical correlations:
> sqrt(eigen(E1)$values)
[output not preserved]

The canonical correlations are

g1 = 0.98
g2 = 0.91

The relationship within each of the two pairs of canonical variates thus seems to be quite strong.

f)

> dat.std <- apply(dat, 2, scale)
> # compute canonical correlation variables:
> u1 <- dat.std[,1:2] %*% a1
> u2 <- dat.std[,1:2] %*% a2
> v1 <- dat.std[,3:8] %*% b1
> v2 <- dat.std[,3:8] %*% b2
> # check covariance matrix:
> round(var(cbind(u1,u2,v1,v2)),3)
[output not preserved]
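The covariance check in f) can be reproduced on artificial data. The following is only a minimal sketch under stated assumptions: the data are simulated (they are not the car data), the helper name unit_scale is hypothetical, and Re() is added merely to guard against a complex return type from eigen() on a non-symmetric matrix.

```r
# Simulated data (hypothetical; NOT the car data from the exercise).
set.seed(1)
n <- 200
z <- rnorm(n)                                  # shared latent factor
X <- cbind(x1 = z + rnorm(n), x2 = rnorm(n))   # first set: 2 variables
Y <- cbind(y1 = z + rnorm(n), y2 = rnorm(n), y3 = rnorm(n))  # second set: 3 variables

R   <- cor(cbind(X, Y))
R11 <- R[1:2, 1:2]; R22 <- R[3:5, 3:5]
R12 <- R[1:2, 3:5]; R21 <- R[3:5, 1:2]

E1 <- solve(R11) %*% R12 %*% solve(R22) %*% R21
E2 <- solve(R22) %*% R21 %*% solve(R11) %*% R12
e1 <- eigen(E1); e2 <- eigen(E2)
rho2 <- Re(e1$values)                          # squared canonical correlations

# Rescale a vector a so that t(a) %*% S %*% a equals 1 (hypothetical helper).
unit_scale <- function(a, S) a / sqrt(drop(t(a) %*% S %*% a))
a1 <- unit_scale(Re(e1$vectors[, 1]), R11)
a2 <- unit_scale(Re(e1$vectors[, 2]), R11)
b1 <- unit_scale(Re(e2$vectors[, 1]), R22)
b2 <- unit_scale(Re(e2$vectors[, 2]), R22)

Z  <- scale(cbind(X, Y))                       # standardized data
u1 <- Z[, 1:2] %*% a1; u2 <- Z[, 1:2] %*% a2
v1 <- Z[, 3:5] %*% b1; v2 <- Z[, 3:5] %*% b2

# Covariance matrix of the canonical variables
V <- var(cbind(u1, u2, v1, v2))
round(V, 3)
```

In this sketch the diagonal of V should be 1, the entries V[1,3] and V[2,4] should equal the canonical correlations sqrt(rho2) up to sign (eigenvector signs are arbitrary), and all remaining off-diagonal entries should be zero up to rounding.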

g)

u1 = 0.37 Price [...] Value
v1 = 0.38 Economy [...] Service [...] Design [...] Sport [...] Safety [...] Easy.h.

From the first pair of canonical variables (u1, v1), we see that Price is positively related to Economy and negatively related to the remaining characteristics of a car (service, sportiness, safety and easy handling). The variable Value is negatively related to Economy and positively related to the other characteristics.

The canonical variable u1 can be interpreted as a value index of the car. At one end, we observe cars with a good (low) price and a bad (high) appreciation of value, such as Trabant and Wartburg; at the other end, we see cars with a high price and a good (low) appreciation of value, such as BMW, Jaguar, Ferrari and Mercedes. Similarly, v1 can be interpreted as a quality index consisting of variables such as service and safety. The value and quality indices are highly correlated, with canonical correlation coefficient 0.98. This can be seen in the following plot:

> # plot the first canonical correlation variables:
> plot(u1,v1,main="quality vs. value, correlation=0.98", xlab="u1 = 'value' of car", ylab="v1 = 'quality' of car", pch="")
> text(u1,v1,labels=car$type)

[Figure: "quality vs. value, correlation=0.98"; x-axis: u1 = 'value' of car; y-axis: v1 = 'quality' of car; points labelled by car type]

h)

u2 = 1.65 Price [...] Value
v2 = 0.40 Economy [...] Service 0.05 Design [...] Sport 0.32 Safety [...] Easy.h.

The second pair of canonical variables provides more insight into the relationship between the two sets of variables. u2 has low values for cars with good marks both in price and value, e.g., [...] and Opel. On the right-hand side, we should see cars with bad marks in these two variables, such as Ferrari and Wartburg. The canonical variable v2 consists mainly of the variables economy and service. The position of the cars is displayed in the plot below.

> plot(u2, v2, xlab="u2", ylab="v2", pch="", main="v2 vs. u2")
> text(u2,v2, labels=car$type)

[Figure: "v2 vs. u2"; x-axis: u2; y-axis: v2; points labelled by car type]

2. a) Read the data in with

> car <- read.table("...", sep=";", header=TRUE, na.strings="")
> X <- car[,6]
> Y <- car[,c(3,10)]
> dat <- cbind(X,Y)
> R <- cor(dat)
> R11 <- R[1, 1]
> R22 <- R[2:3, 2:3]
> R12 <- R[1, 2:3]
> R21 <- R[2:3, 1]
> R11.inv <- solve(R11)
> R22.inv <- solve(R22)
> # compute E1 and E2
> E1 <- R11.inv %*% R12 %*% R22.inv %*% R21
> E2 <- R22.inv %*% R21 %*% R11.inv %*% R12
> # compute eigenvalues and eigenvectors of E1 and E2:
> eigen(E1)
$values
[1] 0.624

$vectors
     [,1]
[1,]    1

> eigen(E2)
[output not preserved apart from the eigenvalues: 6.24e-01 and a second value of order 1e-17]
> # compute the canonical correlation vectors:
> a1 <- eigen(E1)$vectors[,1]
> b1 <- eigen(E2)$vectors[,1]
> # correct the scaling of the canonical correlation vectors:
> round((a1 <- -1 * a1 / sqrt(t(a1) %*% R11 %*% a1)),2)
     [,1]
[1,]   -1
> round((b1 <- -1 * b1 / sqrt(t(b1) %*% R22 %*% b1)),2)
[output not preserved]

The pair of canonical vectors thus is

a1 = -1
b1 = ( 1.17, 0.34)

and the canonical variables are computed by

u1 = -1 Price
v1 = 1.17 Economy [...] Easy.h.

We observe that the price has a negative influence on the canonical variable v1, which means that price is positively related to economy and negatively related to easy handling.

> # canonical correlation:
> sqrt(eigen(E1)$values)
[1] 0.79

The canonical correlation is g1 = 0.79.

> dat.std <- apply(dat, 2, scale)
> # compute canonical correlation variables:
> u1 <- dat.std[,1] %*% a1
> v1 <- dat.std[,2:3] %*% b1
>
> plot(u1, v1, xlab="u1", ylab="v1", pch="", main="v1 vs. u1")
> text(u1,v1, labels=car$type)

[Figure: "v1 vs. u1"; x-axis: u1; y-axis: v1; points labelled by car type]

We can see that the relationship between the two canonical variables is not as strong as in Exercise 1, where more variables from the same data set are analyzed.

b)

> fit <- lm(X~Economy+Easy.h., data=car)
> summary(fit)

Call:
lm(formula = X ~ Economy + Easy.h., data = car)

Residuals: [values not preserved]

Coefficients: [estimates not preserved; Economy has a p-value of order 1e-05 and is flagged ***]

Residual standard error: [...] on 21 degrees of freedom
Multiple R-squared: 0.624, Adjusted R-squared: [...]
F-statistic: 17.4 on 2 and 21 DF, p-value: 3.48e-05

The multiple R-squared of 0.624 coincides with the eigenvalue of E1 found in a), i.e. with the squared canonical correlation (0.79^2 is approximately 0.624). With a single variable in the first set, the canonical correlation is the multiple correlation coefficient of the regression.

3. a) The observations 110 and 111 have missing values for lchlorophyll.

> fossil <- read.table("...", header=TRUE)
> dat <- fossil[,c("sangle","llength","rwidth","SST.mean","Salinity","lchlorophyll")]
> dat <- dat[-c(110,111),]
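Rather than hard-coding the row numbers, the incomplete observations can also be located programmatically. The sketch below uses a small made-up data frame (the values are hypothetical; only the column names are borrowed from the exercise) and the base R function complete.cases().

```r
# Toy data frame (hypothetical values; column names borrowed from the exercise).
toy <- data.frame(
  sangle       = c(1.2, 0.8, 1.5, 1.1, 0.9),
  lchlorophyll = c(0.3, NA, 0.7, NA, 0.5)
)

# Row indices of observations with at least one missing value
incomplete <- which(!complete.cases(toy))

# Keep only the complete observations (na.omit(toy) selects the same rows)
clean <- toy[complete.cases(toy), ]
```

Applied to the fossil data, the same idea would be expected to flag observations 110 and 111, assuming these are the only rows with missing values among the selected columns.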

b)

> R <- cor(dat)
> R11 <- R[1:3, 1:3]
> R22 <- R[4:6, 4:6]
> R12 <- R[1:3, 4:6]
> R21 <- R[4:6, 1:3]
> R11.inv <- solve(R11)
> R22.inv <- solve(R22)
> # compute E1 and E2
> E1 <- R11.inv %*% R12 %*% R22.inv %*% R21
> E2 <- R22.inv %*% R21 %*% R11.inv %*% R12
> # compute eigenvalues and eigenvectors of E1 and E2:
> eigen(E1)
[output not preserved]
> eigen(E2)
[output not preserved]
> # compute the canonical correlation vectors:
> a1 <- eigen(E1)$vectors[,1]
> a2 <- eigen(E1)$vectors[,2]
> a3 <- eigen(E1)$vectors[,3]
> b1 <- eigen(E2)$vectors[,1]
> b2 <- eigen(E2)$vectors[,2]
> b3 <- eigen(E2)$vectors[,3]
> # correct the scaling of the canonical correlation vectors:
> round((a1 <- -1 * a1 / sqrt(t(a1) %*% R11 %*% a1)),2)
[output not preserved]
> round((a2 <- -1 * a2 / sqrt(t(a2) %*% R11 %*% a2)),2)
[output not preserved]
> round((a3 <- -1 * a3 / sqrt(t(a3) %*% R11 %*% a3)),2)
[output not preserved]
> round((b1 <- -1 * b1 / sqrt(t(b1) %*% R22 %*% b1)),2)
[output not preserved]
> round((b2 <- -1 * b2 / sqrt(t(b2) %*% R22 %*% b2)),2)
[output not preserved]
> round((b3 <- -1 * b3 / sqrt(t(b3) %*% R22 %*% b3)),2)
[output not preserved]

The first pair of canonical variables is computed by

u1 = 1.09 sangle [...] llength [...] rwidth
v1 = 1.04 SST.mean 0.26 Salinity [...] lchlorophyll

The second one by

u2 = 0.88 sangle [...] llength [...] rwidth
v2 = 0.39 SST.mean 0.97 Salinity 0.16 lchlorophyll

and the third one by

u3 = 0.41 sangle [...] llength 1.33 rwidth
v3 = 0.03 SST.mean 0.34 Salinity 1.04 lchlorophyll

c)

> # canonical correlations:
> sqrt(eigen(E1)$values)
[output not preserved]

The correlations of the first two pairs of canonical variables are quite high (0.88 and 0.56), while that of the third pair is low. This means that u3 and v3 are almost uncorrelated, or, in this case, that v3 probably has no influence on u3. On the other hand, v1 seems to have a large influence on u1, and v2 a fairly large influence on u2.

d) The canonical variable u1 seems to be dominated by sangle. It is not easy to find an interpretation of v1, but it seems that SST.mean has quite a large negative influence on sangle. The same holds for lchlorophyll. Salinity seems to have a positive influence on sangle.

The canonical variable u2 could be some kind of shape measure of the coccolith (big coccoliths, i.e. long and wide ones with a small angle, vs. small round coccoliths with a large angle). All the environmental variables seem to have a negative influence on the shape of a coccolith, meaning that if the environmental variables take high values, the coccolith will be small with a large angle.

> dat.std <- apply(dat,2,scale)
> # compute canonical correlation variables:
> u1 <- dat.std[,1:3] %*% a1
> u2 <- dat.std[,1:3] %*% a2
> v1 <- dat.std[,4:6] %*% b1
> v2 <- dat.std[,4:6] %*% b2
> # plot canonical correlation variables:
> par(mfrow=c(1,2))
> plot(v1,u1,main="sqrt(angle) vs. v1", xlab="v1", ylab="u1")
> plot(v2,u2, main="shape vs. v2", xlab="v2", ylab="u2")

[Figure: two panels, "sqrt(angle) vs. v1" (x-axis: v1, y-axis: u1) and "shape vs. v2" (x-axis: v2, y-axis: u2)]
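As a cross-check of canonical correlations obtained from the eigenvalues of E1, base R also provides cancor() in the stats package. The following is only a sketch on simulated data (hypothetical, not the fossil data); pmax() and Re() are defensive additions against rounding and a complex return type, and are not part of the solution above.

```r
# Simulated data (hypothetical; NOT the fossil data from the exercise).
set.seed(2)
n <- 150
z <- rnorm(n)                                   # shared latent factor
X <- cbind(z + rnorm(n), rnorm(n), rnorm(n))    # first set: 3 variables
Y <- cbind(z + rnorm(n), rnorm(n), rnorm(n))    # second set: 3 variables

# Eigenvalue route, as in the solution above
R   <- cor(cbind(X, Y))
R11 <- R[1:3, 1:3]; R22 <- R[4:6, 4:6]
R12 <- R[1:3, 4:6]; R21 <- R[4:6, 1:3]
E1  <- solve(R11) %*% R12 %*% solve(R22) %*% R21
rho_eigen <- sqrt(pmax(Re(eigen(E1)$values), 0))

# Built-in route: cancor() returns the canonical correlations in $cor
rho_cancor <- cancor(X, Y)$cor

round(cbind(rho_eigen, rho_cancor), 4)
```

Both columns should agree up to rounding, since canonical correlations do not depend on the scaling of the variables. The coefficient vectors returned by cancor() in $xcoef and $ycoef use a different normalization than the vectors a and b above, so they are comparable only up to a scaling factor and sign.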

HawkEye Pro. NEW and EXCLUSIVE Professional Diagnostic tool for the workshop or mobile technician. Fully unlocked for ALL Land Rover vehicles*

HawkEye Pro. NEW and EXCLUSIVE Professional Diagnostic tool for the workshop or mobile technician. Fully unlocked for ALL Land Rover vehicles* NEW and EXCLUSIVE Professional Diagnostic tool for the workshop or mobile technician Fully unlocked for ALL Land Rover vehicles* * Exclusions Apply FREELANDER DEFENDER DISCOVERY RANGE ROVER A New diagnostic

More information

Operators and the Formula Argument in lm

Operators and the Formula Argument in lm Operators and the Formula Argument in lm Recall that the first argument of lm (the formula argument) took the form y. or y x (recall that the term on the left of the told lm what the response variable

More information

MODELS WITHOUT AN INTERCEPT

MODELS WITHOUT AN INTERCEPT Consider the balanced two factor design MODELS WITHOUT AN INTERCEPT Factor A 3 levels, indexed j 0, 1, 2; Factor B 5 levels, indexed l 0, 1, 2, 3, 4; n jl 4 replicate observations for each factor level

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

Estimating the Market Share Attraction Model using. Support Vector Regressions

Estimating the Market Share Attraction Model using. Support Vector Regressions Estimating the Market Share Attraction Model using Support Vector Regressions Georgi I. Nalbantov Philip Hans Franses Patrick J. F. Groenen Jan C. Bioch Econometric Institute Report EI27-6 Abstract We

More information

Bias, Variance and Parsimony in Regression Analysis. ECS 256 Winter 2014

Bias, Variance and Parsimony in Regression Analysis. ECS 256 Winter 2014 Bias, Variance and Parsimony in Regression Analysis ECS 256 Winter 2014 Christopher Patton, cjpatton@ucdavis.edu Alex Rumbaugh, aprumbaugh@ucdavis.edu Thomas Provan,tcprovan@ucdavis.edu Olga Prilepova,

More information

Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov

Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT Nov 20 2015 Charlotte Wickham stat511.cwick.co.nz Quiz #4 This weekend, don t forget. Usual format Assumptions Display 7.5 p. 180 The ideal normal, simple

More information

France FRANCE Q HIGHLIGHTS COVERAGE CONTENT. Country Statistics for France

France FRANCE Q HIGHLIGHTS COVERAGE CONTENT. Country Statistics for France FRANCE Q2 2008 HIGHLIGHTS France COVERAGE The area covers the countries of France, Andorra and Monaco. The NAVTEQ map of France covers 100% of the population as Prime Coverage. This release includes 1,254,870

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018 Work all problems. 60 points needed to pass at the Masters level, 75 to pass at the PhD

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

G562 Geometric Morphometrics. Statistical Tests. Department of Geological Sciences Indiana University. (c) 2012, P. David Polly

G562 Geometric Morphometrics. Statistical Tests. Department of Geological Sciences Indiana University. (c) 2012, P. David Polly Statistical Tests Basic components of GMM Procrustes This aligns shapes and minimizes differences between them to ensure that only real shape differences are measured. PCA (primary use) This creates a

More information

Psychology 405: Psychometric Theory

Psychology 405: Psychometric Theory Psychology 405: Psychometric Theory Homework Problem Set #2 Department of Psychology Northwestern University Evanston, Illinois USA April, 2017 1 / 15 Outline The problem, part 1) The Problem, Part 2)

More information

Chapter 12: Linear regression II

Chapter 12: Linear regression II Chapter 12: Linear regression II Timothy Hanson Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 14 12.4 The regression model

More information

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example The Big Picture Remedies after Model Diagnostics The Big Picture Model Modifications Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison February 6, 2007 Residual plots

More information

Motor Trend Car Road Analysis

Motor Trend Car Road Analysis Motor Trend Car Road Analysis Zakia Sultana February 28, 2016 Executive Summary You work for Motor Trend, a magazine about the automobile industry. Looking at a data set of a collection of cars, they are

More information

General Linear Statistical Models - Part III

General Linear Statistical Models - Part III General Linear Statistical Models - Part III Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Interaction Models Lets examine two models involving Weight and Domestic in the cars93 dataset.

More information

Model Modifications. Bret Larget. Departments of Botany and of Statistics University of Wisconsin Madison. February 6, 2007

Model Modifications. Bret Larget. Departments of Botany and of Statistics University of Wisconsin Madison. February 6, 2007 Model Modifications Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison February 6, 2007 Statistics 572 (Spring 2007) Model Modifications February 6, 2007 1 / 20 The Big

More information

Consumer Search and Prices in the Automobile Market

Consumer Search and Prices in the Automobile Market Consumer Search and Prices in the Automobile Market José Luis Moraga-González Zsolt Sándor Matthijs R. Wildenbeest First draft: December 2009 PRELIMINARY AND INCOMPLETE, COMMENTS WELCOME Abstract In many

More information

Chapter 5: Exploring Data: Distributions Lesson Plan

Chapter 5: Exploring Data: Distributions Lesson Plan Lesson Plan Exploring Data Displaying Distributions: Histograms For All Practical Purposes Mathematical Literacy in Today s World, 7th ed. Interpreting Histograms Displaying Distributions: Stemplots Describing

More information

Multiple Linear Regression (solutions to exercises)

Multiple Linear Regression (solutions to exercises) Chapter 6 1 Chapter 6 Multiple Linear Regression (solutions to exercises) Chapter 6 CONTENTS 2 Contents 6 Multiple Linear Regression (solutions to exercises) 1 6.1 Nitrate concentration..........................

More information

Comparison of two non-linear model-based control strategies for autonomous vehicles

Comparison of two non-linear model-based control strategies for autonomous vehicles Comparison of two non-linear model-based control strategies for autonomous vehicles E. Alcala*, L. Sellart**, V. Puig*, J. Quevedo*, J. Saludes*, D. Vázquez** and A. López** * Supervision & Security of

More information

Safety Rules and directions for use of pullers. Mechanical Pullers

Safety Rules and directions for use of pullers. Mechanical Pullers Safety Rules and directions for use of pullers Special: Pullers with Quick-Action- Nut EXTRA Decision criterias for the identification of a suitable puller: - gripping possibility determine, whether outside

More information

Regression_Model_Project Md Ahmed June 13th, 2017

Regression_Model_Project Md Ahmed June 13th, 2017 Regression_Model_Project Md Ahmed June 13th, 2017 Executive Summary Motor Trend is a magazine about the automobile industry. It is interested in exploring the relationship between a set of variables and

More information

Canonical Correlations

Canonical Correlations Canonical Correlations Summary The Canonical Correlations procedure is designed to help identify associations between two sets of variables. It does so by finding linear combinations of the variables in

More information

6 Multivariate Regression

6 Multivariate Regression 6 Multivariate Regression 6.1 The Model a In multiple linear regression, we study the relationship between several input variables or regressors and a continuous target variable. Here, several target variables

More information

Regression and Models with Multiple Factors. Ch. 17, 18

Regression and Models with Multiple Factors. Ch. 17, 18 Regression and Models with Multiple Factors Ch. 17, 18 Mass 15 20 25 Scatter Plot 70 75 80 Snout-Vent Length Mass 15 20 25 Linear Regression 70 75 80 Snout-Vent Length Least-squares The method of least

More information

Class: Dean Foster. September 30, Read sections: Examples chapter (chapter 3) Question today: Do prices go up faster than they go down?

Class: Dean Foster. September 30, Read sections: Examples chapter (chapter 3) Question today: Do prices go up faster than they go down? Class: Dean Foster September 30, 2013 Administrivia Read sections: Examples chapter (chapter 3) Gas prices Question today: Do prices go up faster than they go down? Idea is that sellers watch spot price

More information

Comparing Nested Models

Comparing Nested Models Comparing Nested Models ST 370 Two regression models are called nested if one contains all the predictors of the other, and some additional predictors. For example, the first-order model in two independent

More information

Fractional Factorial Designs

Fractional Factorial Designs Fractional Factorial Designs ST 516 Each replicate of a 2 k design requires 2 k runs. E.g. 64 runs for k = 6, or 1024 runs for k = 10. When this is infeasible, we use a fraction of the runs. As a result,

More information

A Strategy to Interpret Brand Switching Data with a Special Model for Loyal Buyer Entries

A Strategy to Interpret Brand Switching Data with a Special Model for Loyal Buyer Entries A Strategy to Interpret Brand Switching Data with a Special Model for Loyal Buyer Entries B. G. Mirkin Introduction The brand switching data (see, for example, Zufryden (1986), Colombo and Morrison (1989),

More information

Linear Model Specification in R

Linear Model Specification in R Linear Model Specification in R How to deal with overparameterisation? Paul Janssen 1 Luc Duchateau 2 1 Center for Statistics Hasselt University, Belgium 2 Faculty of Veterinary Medicine Ghent University,

More information

Multiple Regression: Example

Multiple Regression: Example Multiple Regression: Example Cobb-Douglas Production Function The Cobb-Douglas production function for observed economic data i = 1,..., n may be expressed as where O i is output l i is labour input c

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination ST 430/514 The coefficient of determination, R 2, is defined as before: R 2 = 1 SS E (yi ŷ i ) = 1 2 SS yy (yi ȳ) 2 The interpretation of R 2 is still the fraction of variance

More information

Consumer Search and Prices in the Automobile Market

Consumer Search and Prices in the Automobile Market Consumer Search and Prices in the Automobile Market José Luis Moraga-González Zsolt Sándor Matthijs R. Wildenbeest First version: December 2009 Current version: November 2010 PRELIMINARY AND INCOMPLETE,

More information

Nonlinear Models. Daphnia: Purveyors of Fine Fungus 1/30 2/30

Nonlinear Models. Daphnia: Purveyors of Fine Fungus 1/30 2/30 Nonlinear Models 1/30 Daphnia: Purveyors of Fine Fungus 2/30 What do you do when you don t have a straight line? 7500000 Spores 5000000 2500000 0 30 40 50 60 70 longevity 3/30 What do you do when you don

More information

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam

Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Prof. J. Taylor You may use your 4 single-sided pages of notes This exam is 7 pages long. There are 4 questions, first 3 worth 10

More information

Math 2311 Written Homework 6 (Sections )

Math 2311 Written Homework 6 (Sections ) Math 2311 Written Homework 6 (Sections 5.4 5.6) Name: PeopleSoft ID: Instructions: Homework will NOT be accepted through email or in person. Homework must be submitted through CourseWare BEFORE the deadline.

More information

Stat 412/512 TWO WAY ANOVA. Charlotte Wickham. stat512.cwick.co.nz. Feb

Stat 412/512 TWO WAY ANOVA. Charlotte Wickham. stat512.cwick.co.nz. Feb Stat 42/52 TWO WAY ANOVA Feb 6 25 Charlotte Wickham stat52.cwick.co.nz Roadmap DONE: Understand what a multiple regression model is. Know how to do inference on single and multiple parameters. Some extra

More information

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species Lecture notes 2/22/2000 Dummy variables and extra SS F-test Page 1 Crab claw size and closing force. Problem 7.25, 10.9, and 10.10 Regression for all species at once, i.e., include dummy variables for

More information

STAT 350: Summer Semester Midterm 1: Solutions

STAT 350: Summer Semester Midterm 1: Solutions Name: Student Number: STAT 350: Summer Semester 2008 Midterm 1: Solutions 9 June 2008 Instructor: Richard Lockhart Instructions: This is an open book test. You may use notes, text, other books and a calculator.

More information

General Linear Statistical Models

General Linear Statistical Models General Linear Statistical Models Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin This framework includes General Linear Statistical Models Linear Regression Analysis of Variance (ANOVA) Analysis

More information

Lab #5 - Predictive Regression I Econ 224 September 11th, 2018

Lab #5 - Predictive Regression I Econ 224 September 11th, 2018 Lab #5 - Predictive Regression I Econ 224 September 11th, 2018 Introduction This lab provides a crash course on least squares regression in R. In the interest of time we ll work with a very simple, but

More information

Stat 412/512 REVIEW OF SIMPLE LINEAR REGRESSION. Jan Charlotte Wickham. stat512.cwick.co.nz

Stat 412/512 REVIEW OF SIMPLE LINEAR REGRESSION. Jan Charlotte Wickham. stat512.cwick.co.nz Stat 412/512 REVIEW OF SIMPLE LINEAR REGRESSION Jan 7 2015 Charlotte Wickham stat512.cwick.co.nz Announcements TA's Katie 2pm lab Ben 5pm lab Joe noon & 1pm lab TA office hours Kidder M111 Katie Tues 2-3pm

More information

Linear Modelling: Simple Regression

Linear Modelling: Simple Regression Linear Modelling: Simple Regression 10 th of Ma 2018 R. Nicholls / D.-L. Couturier / M. Fernandes Introduction: ANOVA Used for testing hpotheses regarding differences between groups Considers the variation

More information

North Carolina Offshore Buoy Data. Christopher Nunalee Deirdre Fateiger Tom Meiners

North Carolina Offshore Buoy Data. Christopher Nunalee Deirdre Fateiger Tom Meiners 1 North Carolina Offshore Buoy Data Christopher Nunalee Deirdre Fateiger Tom Meiners 2 Table of Contents Executive Summary 3 Description of Data..3 Statistical Analysis..4 Major Findings 7 Discussion 8

More information

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.

More information

Studies in Multivariate Statistics

Studies in Multivariate Statistics Studies in Multivariate Statistics Wolfgang Härdle Zdeněk Hlávka Cover art: Frank Wiles, The Strand Magazine, February 1927, illustration to The Adventure of the Veiled Lodger by A.C. Doyle ii Version:

More information

Stat 5102 Final Exam May 14, 2015

Stat 5102 Final Exam May 14, 2015 Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8 1 11 2 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions

More information

LINK Gooding & Co auction Montery 2015 August 15/

LINK Gooding & Co auction Montery 2015 August 15/ LINK Gooding & Co auction Montery 2015 August 15/16 2015 Lot Description Lower Estimate Median Estimate Upper Estimate Hammer price Price with premium Key In lower half of estimate In upper half of estimate

More information

Principal Components. Summary. Sample StatFolio: pca.sgp

Principal Components. Summary. Sample StatFolio: pca.sgp Principal Components Summary... 1 Statistical Model... 4 Analysis Summary... 5 Analysis Options... 7 Scree Plot... 8 Component Weights... 9 D and 3D Component Plots... 10 Data Table... 11 D and 3D Component

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Section 1.2 with Graphs The Practice of Statistics, 4 th edition - For AP* STARNES, YATES, MOORE Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1

More information

Multiple Regression: Mixed Predictor Types. Tim Frasier

Multiple Regression: Mixed Predictor Types. Tim Frasier Multiple Regression: Mixed Predictor Types Tim Frasier Copyright Tim Frasier This work is licensed under the Creative Commons Attribution 4.0 International license. Click here for more information. The

More information

Chapter 8 Conclusion

Chapter 8 Conclusion 1 Chapter 8 Conclusion Three questions about test scores (score) and student-teacher ratio (str): a) After controlling for differences in economic characteristics of different districts, does the effect

More information

Multivariate Analysis of Variance

Multivariate Analysis of Variance Chapter 15 Multivariate Analysis of Variance Jolicouer and Mosimann studied the relationship between the size and shape of painted turtles. The table below gives the length, width, and height (all in mm)

More information

R Output for Linear Models using functions lm(), gls() & glm()

R Output for Linear Models using functions lm(), gls() & glm() LM 04 lm(), gls() &glm() 1 R Output for Linear Models using functions lm(), gls() & glm() Different kinds of output related to linear models can be obtained in R using function lm() {stats} in the base

More information

Chapter 3 - Linear Regression

Chapter 3 - Linear Regression Chapter 3 - Linear Regression Lab Solution 1 Problem 9 First we will read the Auto" data. Note that most datasets referred to in the text are in the R package the authors developed. So we just need to

More information

Introduction to Statistics and R

Introduction to Statistics and R Introduction to Statistics and R Mayo-Illinois Computational Genomics Workshop (2018) Ruoqing Zhu, Ph.D. Department of Statistics, UIUC rqzhu@illinois.edu June 18, 2018 Abstract This document is a supplimentary

More information

Alternator Test Leads

Alternator Test Leads Alternator Test eads $34.92 $22.74 ield 897ACRCUT "A" Circuit Adapter (89700184) 897BM iat, Bosch $22.74 $34.40 D+ AM 897AM NCON 2004+ 897BT Bosch, ucas ndustrial $19.59 $19.59 W D+ D+ W 897B Bosch, ucas,

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

Extensions of One-Way ANOVA.

Extensions of One-Way ANOVA. Extensions of One-Way ANOVA http://www.pelagicos.net/classes_biometry_fa18.htm What do I want You to Know What are two main limitations of ANOVA? What two approaches can follow a significant ANOVA? How

More information

De-mystifying random effects models

De-mystifying random effects models De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,

More information

Study Sheet. December 10, The course PDF has been updated (6/11). Read the new one.

Study Sheet. December 10, The course PDF has been updated (6/11). Read the new one. Study Sheet December 10, 2017 The course PDF has been updated (6/11). Read the new one. 1 Definitions to know The mode:= the class or center of the class with the highest frequency. The median : Q 2 is

More information

Regression on Faithful with Section 9.3 content

Regression on Faithful with Section 9.3 content Regression on Faithful with Section 9.3 content The faithful data frame contains 272 obervational units with variables waiting and eruptions measuring, in minutes, the amount of wait time between eruptions,

More information

L21: Chapter 12: Linear regression

L21: Chapter 12: Linear regression L21: Chapter 12: Linear regression Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 37 So far... 12.1 Introduction One sample

More information

BIOSTATS 640 Spring 2018 Unit 2. Regression and Correlation (Part 1 of 2) R Users

Practice Problems Solutions R Users. In this exercise, you will gain some practice doing

Density Temp vs Ratio. temp

1. (a) [Figure: density histograms of temp and ratio] The histogram shows that the temperature measures have two peaks,

Extensions of One-Way ANOVA.

http://www.pelagicos.net/classes_biometry_fa17.htm What do I want You to Know What are two main limitations of ANOVA? What two approaches can follow a significant ANOVA? How

The linear model. Our models so far are linear. Change in Y due to change in X? See plots for: o age vs. ahe o carats vs.

8 Nonlinear effects Lots of effects in economics are nonlinear Examples Deal with these in two (sort of three) ways: o Polynomials o Logarithms o Interaction terms (sort of) 1 The linear model Our models

Chapter 5 Exercises 1. Data Analysis & Graphics Using R Solutions to Exercises (April 24, 2004)

Preliminaries > library(DAAG) Exercise 2 The final three sentences have been reworded For each of the data

(a) The percentage of variation in the response is given by the Multiple R-squared, which is 52.67%.

STOR 664 Homework 2 Solution Part A Exercise (Faraway book) Ch2 Ex1 > data(teengamb) > attach(teengamb) > tgl summary(tgl) Coefficients: Estimate Std Error t value

GMM - Generalized method of moments

GMM Intuition: Matching moments. You want to estimate properties of a data set $\{x_t\}_{t=1}^{T}$. You assume that $x_t$ has a constant mean and variance: $x_t \sim (\mu_0, \sigma^2)$. Consider

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

Work all problems 60 points are needed to pass at the Masters Level and 75

Practice Final Examination

Mth 136 = Sta 114 Wednesday, 2000 April 26, 2:20 3:00 pm This is a closed-book examination so please do not refer to your notes, the text, or to any other books You may use a

Name. City Weight Model MPG

Name The following table reports the EPA's city miles per gallon rating and the weight (in lbs.) for the sports cars described in Consumer Reports 99 New Car Buying Guide. (The EPA rating for the Audii

IES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc

IES 612/STA 4-573/STA 4-576 Winter 2008 Week 1--IES 612-STA 4-573-STA 4-576.doc Review Notes: [OL] = Ott & Longnecker, Statistical Methods and Data Analysis, 5th edition. [Handouts based on notes prepared

STAT 3022 Spring 2007

Simple Linear Regression Example These commands reproduce what we did in class. You should enter these in R and see what they do. Start by typing > set.seed(42) to reset the random number generator so

Principal components

Principal components is a general analysis technique that has some application within regression, but has a much wider use as well. Technical Stuff We have yet to define the term covariance,

Louis Roussos Sports Data

Rank the sports you most like to participate in, 1 = favorite, 7 = least favorite. There are n=130 rank vectors. > sportsranks Baseball Football Basketball Tennis Cycling Swimming

PIPERS CAR SALES Northgate Tickhill. Doncaster. DN11 9HY Tel: Contact:Stewart Piper STRATSTONE- LANDROVER

MOTOR TRADE PARTNERSHIP DONCASTER BURROWS-TOYOTA Quest Park DN2 4LT Tel: 01302 762300 Fax: 01302 762333 Contact: Andrew Brown www.burrows.co.uk HAYSELDEN-VW York Road Roundabout DN5 8AN Tel: 01302 364141

Dealing with Heteroskedasticity

James H. Steiger, Department of Psychology and Human Development, Vanderbilt University

Linear Models II. Chapter Key ideas

Chapter 6 Linear Models II 6.1 Key ideas. Consider a situation in which we take measurements of some attribute Y on two distinct groups. We want to know whether the mean of group 1, $\mu_1$, is different from

Examples of fitting various piecewise-continuous functions to data, using basis functions in doing the regressions.

David. Boore These examples in this document used R to do the regression. See also Notes_on_piecewise_continuous_regression.doc

SCHOOL OF MATHEMATICS AND STATISTICS

Linear Models, Autumn Semester 2015-16, 2 hours. Marks will be awarded for your best three answers. RESTRICTED OPEN BOOK EXAMINATION Candidates may bring to the examination

Collinearity: Impact and Possible Remedies

Deepayan Sarkar What is collinearity? Exact dependence between columns of X makes coefficients non-estimable. Collinearity refers to the situation where some columns

Week 3: Multiple Linear Regression

BUS41100 Applied Regression Analysis. Polynomial regression, categorical variables, interactions & main effects, $R^2$. Max H. Farrell, The University of Chicago Booth School

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013

STAC67H3 Regression Analysis Duration: One hour and fifty minutes Last Name: First Name: Student

Solution to Series 6

Dr. M. Dettling Applied Series Analysis SS 2014 1. a) > r.bel.lm summary(r.bel.lm) Call: lm(formula = NURSING ~., data = d.beluga) Residuals: Min 1Q

lm statistics Chris Parrish

2017-04-01 Contents: $s_e$ and $R^2$; experiment1; experiment2; experiment3

Homework 2. For the homework, be sure to give full explanations where required and to turn in any relevant plots.

1 Data analysis problems 1. The file berkeley.dat contains average yearly temperatures for

SLR output RLS. Refer to slr (code) on the Lecture Page of the class website.

Old Faithful at Yellowstone National Park, WY: Simple Linear Regression (SLR) Analysis SLR analysis explores the linear association

Math Section MW 1-2:30pm SR 117. Bekki George 206 PGH

Math 3339 Section 21155 MW 1-2:30pm SR 117 Bekki George bekki@math.uh.edu 206 PGH Office Hours: M 11-12:30pm & T,TH 10:00 11:00 am and by appointment Linear Regression (again) Consider the relationship

Lecture 10. Factorial experiments (2-way ANOVA etc)

Jesper Rydén Matematiska institutionen, Uppsala universitet jesper@math.uu.se Regression and Analysis of Variance autumn 2014 A factorial experiment

Regression Analysis lab 3. 1 Multiple linear regression. 1.1 Import data. 1.2 Scatterplot matrix

Regression Analysis lab 3 1 Multiple linear regression 1.1 Import data delivery

Analytics 512: Homework # 2 Tim Ahn February 9, 2016

Chapter 3 Problem 1 (# 3) Suppose we have a data set with five predictors, $X_1$ = GPA, $X_2$ = IQ, $X_3$ = Gender (1 for Female and 0 for Male), $X_4$ = Interaction

Introduction and Single Predictor Regression. Correlation

Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation

Simple linear regression

Business Statistics 41000 Fall 2015 Topics 1. conditional distributions, squared error, means and variances 2. linear prediction 3. signal + noise and $R^2$ goodness of fit 4.

Applied Regression Analysis

Chapter 3 Multiple Linear Regression Hongcheng Li April 6, 2013 1 Recall simple linear regression 2 Parameter Estimation 3 Interpretations of

cor(dataset$measurement1, dataset$measurement2, method = "pearson") cor.test(datavector1, datavector2, method = "pearson")

Tutorial 7: Correlation and Regression Correlation Used to test whether two variables are linearly associated. A correlation coefficient (r) indicates the strength and direction of the association. A correlation

Chapter 16: Understanding Relationships Numerical Data

These notes reflect material from our text, Statistics, Learning from Data, First Edition, by Roxy Peck, published by CENGAGE Learning, 2015. Linear