Multivariate Analysis - Overview
- Rafe Patterson
Transcription
1 Multivariate Analysis - Overview
In general:
- Analysis of multivariate data, i.e. each observation has two or more variables
Analyses:
- Analysis of treatment means (single multivariate sample, two samples, etc.)
- Study of interrelationships: correlations and predictions (regression)
- Other specific methods (discriminant analysis, principal components, clustering)
Limitations:
- Many parameters are estimated, so large sample sizes are needed
- Hypothesis tests almost always assume normality
We'll focus on the multivariate methods and applications with limited mathematical emphasis (without proofs).
Textbook: Applied Multivariate Statistical Analysis, Fifth Edition, Richard Johnson and Dean Wichern
2 Sweat data: One sample, testing means (Table 5.1)
Perspiration from 20 healthy women was analyzed. Three components:
- $X_1$ = sweat rate
- $X_2$ = sodium content
- $X_3$ = potassium content
Null hypothesis: $\mu_1 = 4$, $\mu_2 = 50$, $\mu_3 = 10$ simultaneously, or equivalently $H_0: \boldsymbol{\mu} = [4, 50, 10]'$
3 Sweat data: One sample, testing means (Table 5.1)
[Data table; columns: Sweat rate, Sodium, Potassium]
4 Sweat data: One sample, testing means (Table 5.1)
$\boldsymbol{\mu}_0 = [4, 50, 10]'$
[Minitab output, One-Sample T: Sweatrate, Sodium, Potasium, shown twice: once with 90% and once with 95% confidence intervals. Columns: Variable, N, Mean, StDev, SE Mean, CI. Rows: Sweatrate, Sodium, Potasium]
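5 Sweat data: One sample, testing means (Table 5.1)
$\boldsymbol{\mu}_0 = [4, 50, 10]'$
[Matrices shown: the sample covariance matrix $S$ and its inverse $S^{-1}$ (rows and columns: Sweat rate, Sodium, Potassium), with the vectors of means $\bar{x}$, $\boldsymbol{\mu}_0$ and $\bar{x} - \boldsymbol{\mu}_0$]
$T^2 = n(\bar{x} - \boldsymbol{\mu}_0)' S^{-1} (\bar{x} - \boldsymbol{\mu}_0) = 20 \times 0.487 = 9.74$
$F = \frac{n-p}{(n-1)p}\, T^2 = 2.905$
Critical values: $F_{3,17}(0.90) = 2.44$ and $F_{3,17}(0.95) = 3.20$, so $H_0$ is rejected at the 10% level but not at the 5% level.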
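The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function name `one_sample_t2_to_f` is made up for illustration, and the inputs are the numbers quoted on the slide (quadratic form 0.487, n = 20, p = 3).

```python
def one_sample_t2_to_f(quad_form, n, p):
    """Return (T2, F) for a one-sample Hotelling test.

    quad_form is (xbar - mu0)' S^{-1} (xbar - mu0), n is the sample size
    and p the number of variables. F has (p, n - p) degrees of freedom.
    """
    t2 = n * quad_form
    f_stat = (n - p) / ((n - 1) * p) * t2
    return t2, f_stat


# Values quoted on the slide: quadratic form 0.487, n = 20, p = 3.
t2, f_stat = one_sample_t2_to_f(0.487, n=20, p=3)
print(round(t2, 2), round(f_stat, 3))
```

Comparing `f_stat` with the two critical values quoted on the slide reproduces the decision at each level.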
6 Turtle Carapaces: Two samples, testing means (Table 6.7)
Jolicoeur and Mosimann studied the relationship of size and shape for painted turtles. Measurements were taken on the carapaces of 24 male and 24 female turtles. Three components:
- $X_1$ = Length
- $X_2$ = Width
- $X_3$ = Height
Null hypothesis: $\boldsymbol{\mu}_1 = \boldsymbol{\mu}_2$ (vectors of size 3)
7 Turtle Carapaces: Two samples, testing means (Table 6.7)
[Data table; columns: Length, Width, Height, Gender (female, male)]
8 Turtle Carapaces: Two samples, testing means (Table 6.7)
ANOVA: Length, Width, Height versus Gender (Gender fixed, 2 levels: female, male)
[One Analysis of Variance table per response; columns: Source, DF, SS, MS, F, P; rows: Gender, Error, Total]
- Length: R-Sq = 26.78%, R-Sq(adj) = 25.23%
- Width: R-Sq = 28.47%, R-Sq(adj) = 26.95%
- Height: R-Sq = 42.31%, R-Sq(adj) = 41.08%
9 Turtle Carapaces: Two samples, testing means (Table 6.7)
[Means by Gender; columns: N, Length, Width, Height; rows: female, male]
MANOVA for Gender: s = 1, m = 0.5, n = 21.5
[Test table; columns: Criterion, Test Statistic, F, Num DF, Denom DF, P; rows: Wilks', Lawley-Hotelling, Pillai's]
- SSCP Matrix for Gender (B): Determinant = -426 (numerically zero, since B has rank 1 with two groups)
- SSCP Matrix for Error (W): Determinant = 6.067E+08
- SSCP Matrix for Total (B+W): Determinant = 1.481E+09
Wilks' lambda: $\Lambda = \det(W)/\det(B+W) = 0.6067/1.481 = 0.410$
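A minimal sketch of how the B and W matrices and the determinant ratio could be computed with NumPy. The function name `wilks_lambda` and the simulated groups are illustrative assumptions; the turtle measurements themselves are not reproduced here.

```python
import numpy as np


def wilks_lambda(groups):
    """Wilks' lambda = det(W) / det(B + W) for a list of (n_i x p) arrays."""
    all_rows = np.vstack(groups)
    grand_mean = all_rows.mean(axis=0)
    p = all_rows.shape[1]
    W = np.zeros((p, p))  # within-group (error) SSCP
    B = np.zeros((p, p))  # between-group (hypothesis) SSCP
    for g in groups:
        centered = g - g.mean(axis=0)
        W += centered.T @ centered
        d = (g.mean(axis=0) - grand_mean).reshape(-1, 1)
        B += g.shape[0] * (d @ d.T)
    return np.linalg.det(W) / np.linalg.det(B + W), B, W


# Hypothetical data: two groups of 24 observations on 3 variables.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(24, 3))
group_b = rng.normal(loc=1.0, scale=1.0, size=(24, 3))
lam, B, W = wilks_lambda([group_a, group_b])
print(round(lam, 3))
```

B + W equals the total SSCP matrix about the grand mean, which is why the slide can report the determinant of B+W directly.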
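10 Notation for MANOVA: exact F-distributions for Wilks' Lambda
Denote in MANOVA:
- $p$ = number of variables
- $q$ = df of Hypothesis (H) ($g-1$ in one-way)
- $v$ = df of Error (E) ($N-g$ in one-way)
- $s = \min(p, q)$ ($\min(p, g-1)$ in one-way)
Exact distributions:
- a) $p = 1$: $\dfrac{1-\Lambda}{\Lambda}\cdot\dfrac{v}{q} \sim F_{q,\,v}$
- b) $q = 1$: $\dfrac{1-\Lambda}{\Lambda}\cdot\dfrac{v-p+1}{p} \sim F_{p,\,v-p+1}$
- c) $p = 2$: $\dfrac{1-\sqrt{\Lambda}}{\sqrt{\Lambda}}\cdot\dfrac{v-1}{q} \sim F_{2q,\,2(v-1)}$
- d) $q = 2$: $\dfrac{1-\sqrt{\Lambda}}{\sqrt{\Lambda}}\cdot\dfrac{v-p+1}{p} \sim F_{2p,\,2(v-p+1)}$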
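The four special cases can be collected in one small function. This is a sketch under the standard textbook forms of the exact transforms; the function name `wilks_exact_f` is hypothetical. The example call uses the rounded value 0.410 from the turtle MANOVA, with p = 3 variables, q = 1 hypothesis df and v = 46 error df.

```python
import math


def wilks_exact_f(lam, p, q, v):
    """Exact F transform of Wilks' lambda for the four special cases.

    p = number of variables, q = hypothesis df, v = error df.
    Returns (F, numerator df, denominator df).
    """
    if p == 1:
        return (1 - lam) / lam * v / q, q, v
    if q == 1:
        return (1 - lam) / lam * (v - p + 1) / p, p, v - p + 1
    root = math.sqrt(lam)
    if p == 2:
        return (1 - root) / root * (v - 1) / q, 2 * q, 2 * (v - 1)
    if q == 2:
        return (1 - root) / root * (v - p + 1) / p, 2 * p, 2 * (v - p + 1)
    raise ValueError("no exact F transform for this combination of p and q")


# Turtle example: two groups (q = 1), p = 3, v = 48 - 2 = 46.
f_stat, df1, df2 = wilks_exact_f(0.410, p=3, q=1, v=46)
print(round(f_stat, 2), df1, df2)
```

For other combinations of p and q only approximate F transforms are available, which is why the function raises an error rather than guessing.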
11 Turtle Carapaces: Two samples, testing means (Table 6.7), Hotelling $T^2$
[Means by Gender; columns: N, Length, Width, Height; rows: female, male, Diff]
[SSCP Matrix for Error (W); rows and columns: Length, Width, Height]
$S_{pooled} = W/(n_1+n_2-2)$
$T^2 = (\bar{x}_1-\bar{x}_2)' \left[S_{pooled}\left(\frac{1}{n_1}+\frac{1}{n_2}\right)\right]^{-1} (\bar{x}_1-\bar{x}_2) = 65.66$
$F = T^2\,\dfrac{n_1+n_2-p-1}{(n_1+n_2-2)\,p} = 65.66 \times \dfrac{44}{3 \times 46} = 65.66 \times 0.319 = 20.94$, with df $= (p,\ n_1+n_2-p-1) = (3, 44)$
Wilks'
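The two-sample computation can be sketched in NumPy as follows. The function name `two_sample_hotelling` is an assumption made for illustration, and the tiny data set in the example is invented so the result can be verified by hand.

```python
import numpy as np


def two_sample_hotelling(x1, x2):
    """Return (T2, F, df1, df2) for two samples with a pooled covariance."""
    n1, p = x1.shape
    n2 = x2.shape[0]
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    c1 = x1 - x1.mean(axis=0)
    c2 = x2 - x2.mean(axis=0)
    W = c1.T @ c1 + c2.T @ c2              # error SSCP matrix
    s_pooled = W / (n1 + n2 - 2)
    t2 = diff @ np.linalg.solve(s_pooled * (1 / n1 + 1 / n2), diff)
    f_stat = t2 * (n1 + n2 - p - 1) / ((n1 + n2 - 2) * p)
    return t2, f_stat, p, n1 + n2 - p - 1


# Invented one-variable example; with p = 1, T2 is the squared pooled t statistic.
x1 = np.array([[1.0], [2.0], [3.0]])
x2 = np.array([[2.0], [4.0], [6.0]])
t2, f_stat, df1, df2 = two_sample_hotelling(x1, x2)
print(t2, f_stat, df1, df2)

# The conversion quoted on the slide, using T2 = 65.66, n1 = n2 = 24, p = 3.
slide_f = 65.66 * (24 + 24 - 3 - 1) / ((24 + 24 - 2) * 3)
print(round(slide_f, 2))
```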
12 Amitriptyline Data: Multivariate Regression Analysis (Table 7.6)
Amitriptyline is a drug for depression with several side effects: irregular heartbeat, abnormal BP, etc. Data on 17 patients admitted after amitriptyline overdose.
Two dependent variables and 5 predictor variables:
- $Y_1$ = Total TCAD plasma level (TOT)
- $Y_2$ = Amount of amitriptyline in TCAD plasma level (AMI)
- $z_1$ = Gender (1 = female, 0 = male)
- $z_2$ = Amount of amitriptyline taken at time of overdose (AMT)
- $z_3$ = PR wave measurements (PR)
- $z_4$ = Diastolic blood pressure (DIAP)
- $z_5$ = QRS wave measurements (QRS)
Analysis: model to predict $Y_1$ and $Y_2$ from the predictor variables.
Multivariate Linear Regression Models
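A rough sketch of fitting both responses at once by least squares. The data are simulated (17 cases, an intercept and 5 predictors, 2 responses) and only mimic the layout of the problem; none of the numbers come from Table 7.6.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 17                                    # same number of cases as the slide
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 5))])  # intercept, z1..z5
B_true = rng.normal(size=(6, 2))          # hypothetical coefficients
Y = Z @ B_true + 0.1 * rng.normal(size=(n, 2))              # columns: Y1, Y2

# Least squares for both responses together: B_hat = (Z'Z)^{-1} Z'Y.
B_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
residuals = Y - Z @ B_hat
print(B_hat.shape)
```

Each column of `B_hat` matches the coefficients from regressing that response alone on the predictors; the multivariate model adds joint inference through the residual covariance between the two responses.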
13 Amitriptyline Data: Multivariate Regression Analysis (Table 7.6)
[Variables shown: Y1 = TOT (TCAD), Y2 = AMI, z1 = Gender, z2 = AMT (antidepressant amount), z3 = PR, z4 = DIAP (diastolic BP), z5 = QRS]
14 Amitriptyline Data: Multivariate Regression Analysis (Table 7.6)
Multivariate Linear Regression Models
Regression Analysis: Y1 versus Z1, Z2, Z3, Z4, Z5
[Coefficient table; columns: Predictor, Coef, SE Coef, T, P; rows: Constant, Z1, Z2, Z3, Z4, Z5]
R-Sq = 88.7%, R-Sq(adj) = 83.6%
[Analysis of Variance table; columns: Source, DF, SS, MS, F, P; rows: Regression, Residual Error, Total]
15 Amitriptyline Data: Multivariate Regression Analysis (Table 7.6)
Multivariate Linear Regression Models
Regression Analysis: Y2 versus Z1, Z2, Z3, Z4, Z5
[Coefficient table; columns: Predictor, Coef, SE Coef, T, P; rows: Constant, Z1, Z2, Z3, Z4, Z5]
[Analysis of Variance table; columns: Source, DF, SS, MS, F, P; rows: Regression, Residual Error, Total]
16 Radiotherapy: Principal Components (Table 1.5)
Average ratings over the course of treatment for 98 patients undergoing radiotherapy. Six components:
- $X_1$ = number of symptoms (such as nausea, sore throat)
- $X_2$ = amount of activity (on a 1-5 scale)
- $X_3$ = amount of sleep (on a 1-5 scale)
- $X_4$ = amount of food consumed (on a 1-3 scale)
- $X_5$ = appetite (on a 1-5 scale)
- $X_6$ = skin reaction (on a 0-3 scale)
Analysis: find one (or several) linear combinations of the 6 components to represent the patients' response to therapy.
Principal components
17 Radiotherapy: Principal Components (Table 1.5)
[Data table; columns: Symptoms, Activity, Sleep, Eat, Appetite, Skin Reaction]
18 Radiotherapy: Principal Components (Table 1.5)
Principal components and Factor Analysis
Principal Component Analysis: Symptoms, Activity, Sleep, Eat, Appetite, SkinReaction
Eigenanalysis of the Correlation Matrix
[Rows: Eigenvalue, Proportion, Cumulative]
[Loadings table; columns: Variable, PC1, PC2, PC3, PC4, PC5, PC6; rows: Symptoms, Activity, Sleep, Eat, Appetite, SkinReaction]
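The quantities in this output (eigenvalues, proportion and cumulative proportion of variance, and the PC coefficient columns) come from an eigendecomposition of the correlation matrix. A minimal NumPy sketch, assuming simulated ratings in place of Table 1.5 and a hypothetical function name `pca_correlation`:

```python
import numpy as np


def pca_correlation(X):
    """Eigenanalysis of the correlation matrix of the columns of X."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    proportion = eigvals / eigvals.sum()
    return eigvals, proportion, np.cumsum(proportion), eigvecs


# Simulated stand-in for 98 patients measured on 6 variables.
rng = np.random.default_rng(2)
X = rng.normal(size=(98, 6))
eigvals, proportion, cumulative, loadings = pca_correlation(X)
print(np.round(eigvals, 3))
```

Because the analysis uses the correlation matrix, the eigenvalues sum to the number of variables, so each proportion is the eigenvalue divided by 6.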
19 Radiotherapy: Factor Analysis (Table 1.5)
Principal Component Factor Analysis of the Correlation Matrix
Unrotated Factor Loadings and Communalities
[Columns: Variable, Factor1, Factor2, Factor3, Factor4, Communality; rows: Symptoms, Activity, Sleep, Eat, Appetite, SkinReaction, Variance, % Var]
Rotated Factor Loadings and Communalities (Varimax Rotation)
[Columns: Variable, Factor1, Factor2, Factor3, Factor4, Communality; rows: Symptoms, Activity, Sleep, Eat, Appetite, SkinReaction, Variance, % Var]
20 Hemophilia Data: Discriminant Analysis (Table 11.8)
To construct a procedure for detecting potential hemophilia A carriers, blood samples were assayed and measurements made on two variables:
- $X_1 = \log_{10}$(AHF activity), where AHF = antihemophilic factor
- $X_2 = \log_{10}$(AHF-like antigen)
Measurements were taken on two groups of women:
- Normal group: $n_1 = 24$ women who do not carry the hemophilia gene
- Obligatory carriers: $n_2 = 22$ women from known hemophilia A carriers (daughters of hemophiliacs, mothers with more than one hemophiliac son, mothers with one hemophiliac son and other hemophiliac relatives)
New cases are then to be classified.
Classification and Discrimination: Discriminant Analysis
21 Hemophilia Data: Discriminant Analysis (Table 11.8)
[Three data tables: Noncarriers, Obligatory Carriers, and New Cases Requiring Classification; columns in each: Group, log(AHF activity), log(AHF antigen)]
22 Hemophilia Data: Discriminant Analysis (Table 11.8)
Discriminant Analysis: GroupDisc versus logactivity, logantigen
Linear Method for Response: GroupDisc. Predictors: logactivity, logantigen.
[Tables: counts for groups 1 and 2; summary of classification (Put into Group versus True Group, with Total N, N correct, Proportion); squared distance between groups]
N = 75, N Correct = 64, Proportion Correct = 64/75 = 0.853
23 Hemophilia Data: Discriminant Analysis (Table 11.8)
Discriminant Analysis: GroupDisc versus logactivity, logantigen
[Linear Discriminant Function for Groups; columns: groups 1 and 2; rows: Constant, logactivity, logantigen]
[Summary of Misclassified Observations; columns: Observation, True Group, Pred Group, Group, Squared Distance, Probability; misclassified observations are flagged with **, e.g. 5**]
[Prediction for Test Observations; columns: Observation, Pred Group, From Group, Squared Distance, Probability]
24 Hemophilia Data: Discriminant Analysis (Table 11.8)
Discriminant Analysis (Minitab): distance and discriminant functions
Squared distance: the squared distance (also called the Mahalanobis distance) of observation $x$ to the center (mean) of group $i$ is given by the general form
$d_i^2(x) = (x - m_i)'\, S_p^{-1}\, (x - m_i)$
where:
- $x$ = column vector of length $p$ containing the values of the predictors for this observation (note: this column vector is stored as one row)
- $m_i$ = column vector of length $p$ containing the means of the predictors calculated from the data in group $i$
- $S_p$ = pooled covariance matrix, used in linear discriminant analysis
The linear discriminant function for group $i$ is
$m_i'\, S_p^{-1}\, x - 0.5\, m_i'\, S_p^{-1}\, m_i + \ln p_i$
where $\ln p_i$ = natural log of the prior probability of group $i$, and $x$, $m_i$, $S_p$ are as above.
For a given $x$ and equal prior probabilities, the group with the smallest squared distance has the largest linear discriminant function.
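Both formulas translate directly into NumPy. The sketch below uses invented group means and an invented pooled covariance matrix (not the hemophilia estimates); the function names are illustrative only.

```python
import numpy as np


def squared_distance(x, m, s_pooled):
    """Mahalanobis squared distance (x - m)' S_p^{-1} (x - m)."""
    d = x - m
    return d @ np.linalg.solve(s_pooled, d)


def linear_discriminant(x, m, s_pooled, prior):
    """m' S_p^{-1} x - 0.5 m' S_p^{-1} m + ln(prior)."""
    s_inv_m = np.linalg.solve(s_pooled, m)   # S_p^{-1} m (S_p is symmetric)
    return s_inv_m @ x - 0.5 * (m @ s_inv_m) + np.log(prior)


# Hypothetical two-group setup with equal priors.
means = [np.array([0.0, 0.0]), np.array([1.0, 2.0])]
s_pooled = np.array([[1.0, 0.3], [0.3, 2.0]])
x = np.array([0.2, 0.5])

dists = [squared_distance(x, m, s_pooled) for m in means]
scores = [linear_discriminant(x, m, s_pooled, 0.5) for m in means]
print(int(np.argmin(dists)), int(np.argmax(scores)))
```

With equal priors the two indices printed agree, which is the equivalence stated at the end of the slide; with unequal priors the prior term can change which group has the largest discriminant score.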
25 Hemophilia Data: Discriminant Analysis (Table 11.8)
Discriminant Analysis (Minitab): distance and discriminant functions
Posterior probability: the posterior probability of group $i$ given the data is calculated by
$\dfrac{p_i f_i(x)}{\sum_j p_j f_j(x)}$
where:
- $p_i$ = prior probability of group $i$
- $f_i(x)$ = the joint density for the data in group $i$ (with the population parameters replaced by sample estimates)
The largest posterior probability is equivalent to the largest value of $\ln[p_i f_i(x)]$, where (under normality)
$\ln[p_i f_i(x)] = -0.5\,[d_i^2(x) - 2 \ln p_i] - \text{constant}$
The term in square brackets is the generalized squared distance, and
$d_i^2(x) - 2 \ln p_i = -2\,[m_i'\, S_p^{-1}\, x - 0.5\, m_i'\, S_p^{-1}\, m_i + \ln p_i] + x'\, S_p^{-1}\, x$
so the bracketed term on the right is the linear discriminant function.
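Since the constant is the same for every group, posterior probabilities can be computed from the squared distances and priors alone. A sketch with invented means, covariance and priors; the function name `posteriors` is hypothetical.

```python
import numpy as np


def posteriors(x, means, s_pooled, priors):
    """Posterior probabilities p_i f_i(x) / sum_j p_j f_j(x) under normality."""
    scores = []
    for m, p in zip(means, priors):
        d = x - m
        d2 = d @ np.linalg.solve(s_pooled, d)
        scores.append(-0.5 * d2 + np.log(p))   # ln[p_i f_i(x)] up to a constant
    scores = np.array(scores)
    w = np.exp(scores - scores.max())          # subtract max for stability
    return w / w.sum()


# Hypothetical two-group setup.
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
s_pooled = np.eye(2)
post_mid = posteriors(np.array([1.0, 0.0]), means, s_pooled, [0.5, 0.5])
post_near = posteriors(np.array([0.1, 0.0]), means, s_pooled, [0.5, 0.5])
print(post_mid, post_near)
```

A point halfway between the two means gets probability 0.5 for each group when the priors are equal, and a point near the first mean is assigned mostly to the first group.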
26 Bone and Skull, White Leghorn fowls: Canonical Correlation (Ex. 10.4)
To assess correlation between two sets of variables:
- Head measurements $X^{(1)}$: $X^{(1)}_1$ = skull length, $X^{(1)}_2$ = skull breadth
- Leg measurements $X^{(2)}$: $X^{(2)}_1$ = femur length, $X^{(2)}_2$ = tibia length
Create new variables (linear combinations within each set) that capture most of the correlation between the two sets of variables.
Canonical Correlation
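One common way to obtain the canonical correlations is to whiten each set with a Cholesky factor of its covariance matrix and take singular values of the whitened cross-covariance. The sketch below follows that route on simulated data; the function name and the data are assumptions, not the fowl measurements.

```python
import numpy as np


def canonical_correlations(X1, X2):
    """Canonical correlations between the columns of X1 and those of X2."""
    n = X1.shape[0]
    c1 = X1 - X1.mean(axis=0)
    c2 = X2 - X2.mean(axis=0)
    s11 = c1.T @ c1 / (n - 1)
    s22 = c2.T @ c2 / (n - 1)
    s12 = c1.T @ c2 / (n - 1)
    l1 = np.linalg.cholesky(s11)            # s11 = l1 @ l1.T
    l2 = np.linalg.cholesky(s22)
    # K = L1^{-1} S12 L2^{-T}; its singular values are the canonical correlations.
    k = np.linalg.solve(l1, s12) @ np.linalg.inv(l2).T
    return np.linalg.svd(k, compute_uv=False)


# Simulated example: two "head" and two "leg" variables sharing one common factor.
rng = np.random.default_rng(3)
common = rng.normal(size=(200, 1))
head = common + rng.normal(size=(200, 2))
leg = common + rng.normal(size=(200, 2))
rho = canonical_correlations(head, leg)
print(np.round(rho, 3))
```

The singular values come out in decreasing order, so the first entry is the largest correlation attainable between a linear combination of one set and a linear combination of the other.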
27 Bone and Skull, White Leghorn fowls: Canonical Correlation (Ex. 10.4)
[Tables over the four variables: Skull Length, Skull Breadth, Femur Length, Tibia Length]
28 Universities: Cluster Analysis (Table 12-9)
Data on certain universities for variables used to compare or rank major universities. The variables:
- $X_1$ = Average SAT for new freshmen
- $X_2$ = Percent of new freshmen in top 10% of high school class
- $X_3$ = Percent of applicants accepted
- $X_4$ = Student-faculty ratio
- $X_5$ = Estimated annual expenses
- $X_6$ = Graduation rate (%)
Analysis: clustering observations (universities) based on linear combinations of variables (or specific variables). Definition of distances.
Cluster Analysis, Distance Methods and Ordination
29 Universities: Cluster Analysis (Table 12-9)
[Data table; columns: #, University, SAT, Top10, Accept, SFRatio, Expenses]
Universities listed: Harvard, Princeton, Yale, Stanford, MIT, Duke, CalTech, Dartmouth, Brown, JohnsHopkins, UChicago, UPenn, Cornell, Northwestern, Columbia, NotreDame, UVir, Georgetown, CarnegieMellon, UMichigan, UCBerkeley, UWisconsin, PennState, Purdue, TexasA&M
30 Universities: Cluster Analysis (Table 12-9)
Cluster Analysis of Observations: SAT, Top10, Accept, SFRatio, Expenses, Grad
Euclidean Distance, Single Linkage
[Amalgamation Steps table; columns: Step, Number of clusters, Similarity level, Distance level, Clusters joined, New cluster, Number of obs. in new cluster]
Merges shown:
- Notre Dame with UVirginia
- UPenn with Northwestern
- Stanford with Dartmouth
- MIT with Duke
- (Stanford, Dartmouth) with (MIT, Duke)
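The amalgamation steps follow the single linkage rule: at each step, join the two clusters whose closest members are nearest in Euclidean distance. A deliberately naive sketch on four invented points (not the university data); the function name `single_linkage` is hypothetical, and Minitab's similarity level column is not reproduced.

```python
import numpy as np


def single_linkage(X):
    """Return the list of merges as (members_a, members_b, distance)."""
    clusters = [[i] for i in range(len(X))]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(dist[i, j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], float(d)))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# Four invented observations on a line, at positions 0, 1, 3 and 7.
X = np.array([[0.0], [1.0], [3.0], [7.0]])
merges = single_linkage(X)
for members_a, members_b, d in merges:
    print(members_a, members_b, d)
```

In practice the variables would usually be standardized first, since SAT scores and expenses are on very different scales; that step is omitted here.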
31 Universities: Cluster Analysis (Table 12-9)
[Dendrogram with Single Linkage and Euclidean Distance; vertical axis: Similarity; horizontal axis: Observations, in the order Harvard, Yale, Stanford, Dartmouth, MIT, Duke, Princeton, Brown, Upenn, Northwestern, Cornell, Notredame, Uvir, Columbia, Georgetown, Uchicago, UCBerkeley, JohnsHopkins, CarnegieMellon, Umichigan, Uwisconsin, TexasA&M, PennState, Purdue, CalTech]
32 Universities: Cluster Analysis (Table 12-9)
[Dendrogram with Single Linkage and Correlation Coefficient Distance; vertical axis: Similarity; horizontal axis: Variables, in the order SAT, Top10, Expenses, Grad, Accept, SFRatio]
Cluster Analysis of Variables: SAT, Top10, Accept, SFRatio, Expenses, Grad
Correlation Coefficient Distance, Single Linkage
[Amalgamation Steps table; columns: Step, Number of clusters, Similarity level, Distance level, Clusters joined, New cluster, Number of obs. in new cluster]
Correlations: SAT, Top10, Accept, SFRatio, Expenses, Grad
[Lower-triangular correlation matrix; columns: SAT, Top10, Accept, SFRatio, Expenses; rows: Top10, Accept, SFRatio, Expenses, Grad]
33 Universities: Cluster Analysis (Table 12-9)
K-means Cluster Analysis: SAT, Top10, Accept, SFRatio, Expenses, Grad
Number of clusters: 2
[Cluster Centroids; columns: Variable, Cluster1, Cluster2, Grand centroid; rows: SAT, Top10, Accept, SFRatio, Expenses, Grad]
Number of clusters: 3
[Cluster Centroids; columns: Variable, Cluster1, Cluster2, Cluster3, Grand centroid; rows: SAT, Top10, Accept, SFRatio, Expenses, Grad]
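The cluster centroids reported by a k-means run can be illustrated with a basic version of the algorithm (alternate between assigning points to the nearest centroid and recomputing centroids). This is a generic sketch, not necessarily the exact procedure Minitab uses; the function name, the starting centroids and the four data points are invented.

```python
import numpy as np


def k_means(X, init_centroids, n_iter=100):
    """Basic k-means: returns (labels, centroids)."""
    centroids = np.array(init_centroids, dtype=float)
    for _ in range(n_iter):
        # Distance from every observation to every centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(len(centroids))
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids


# Invented data: two obvious groups of two points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, centroids = k_means(X, init_centroids=X[[0, 2]])
grand_centroid = X.mean(axis=0)
print(labels, centroids, grand_centroid)
```

The grand centroid is simply the mean of all observations, which corresponds to the last column of the centroid tables on the slide.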
Profile Analysis Multivariate Regression
Lecture 8 October 12, 2005 Analysis Lecture #8-10/12/2005 Slide 1 of 68 Today s Lecture Profile analysis Today s Lecture Schedule : regression review multiple regression is due Thursday, October 27th,
More informationLeast Squares Estimation
Least Squares Estimation Using the least squares estimator for β we can obtain predicted values and compute residuals: Ŷ = Z ˆβ = Z(Z Z) 1 Z Y ˆɛ = Y Ŷ = Y Z(Z Z) 1 Z Y = [I Z(Z Z) 1 Z ]Y. The usual decomposition
More informationMultivariate Linear Regression Models
Multivariate Linear Regression Models Regression analysis is used to predict the value of one or more responses from a set of predictors. It can also be used to estimate the linear association between
More informationStat 216 Final Solutions
Stat 16 Final Solutions Name: 5/3/05 Problem 1. (5 pts) In a study of size and shape relationships for painted turtles, Jolicoeur and Mosimann measured carapace length, width, and height. Their data suggest
More informationModel Building Chap 5 p251
Model Building Chap 5 p251 Models with one qualitative variable, 5.7 p277 Example 4 Colours : Blue, Green, Lemon Yellow and white Row Blue Green Lemon Insects trapped 1 0 0 1 45 2 0 0 1 59 3 0 0 1 48 4
More informationPrincipal component analysis
Principal component analysis Motivation i for PCA came from major-axis regression. Strong assumption: single homogeneous sample. Free of assumptions when used for exploration. Classical tests of significance
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More information4.1 Computing section Example: Bivariate measurements on plants Post hoc analysis... 7
Master of Applied Statistics ST116: Chemometrics and Multivariate Statistical data Analysis Per Bruun Brockhoff Module 4: Computing 4.1 Computing section.................................. 1 4.1.1 Example:
More informationMULTIVARIATE HOMEWORK #5
MULTIVARIATE HOMEWORK #5 Fisher s dataset on differentiating species of Iris based on measurements on four morphological characters (i.e. sepal length, sepal width, petal length, and petal width) was subjected
More informationMultiple Regression Examples
Multiple Regression Examples Example: Tree data. we have seen that a simple linear regression of usable volume on diameter at chest height is not suitable, but that a quadratic model y = β 0 + β 1 x +
More informationMULTIVARIATE ANALYSIS OF VARIANCE
MULTIVARIATE ANALYSIS OF VARIANCE RAJENDER PARSAD AND L.M. BHAR Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 0 0 lmb@iasri.res.in. Introduction In many agricultural experiments,
More informationMultiple Regression: Chapter 13. July 24, 2015
Multiple Regression: Chapter 13 July 24, 2015 Multiple Regression (MR) Response Variable: Y - only one response variable (quantitative) Several Predictor Variables: X 1, X 2, X 3,..., X p (p = # predictors)
More informationMultivariate Analysis of Variance
Chapter 15 Multivariate Analysis of Variance Jolicouer and Mosimann studied the relationship between the size and shape of painted turtles. The table below gives the length, width, and height (all in mm)
More informationSTA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6
STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf
More informationMultivariate analysis of variance and covariance
Introduction Multivariate analysis of variance and covariance Univariate ANOVA: have observations from several groups, numerical dependent variable. Ask whether dependent variable has same mean for each
More informationTHE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay. Solutions to Final Exam
THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay Solutions to Final Exam 1. (13 pts) Consider the monthly log returns, in percentages, of five
More informationMANOVA MANOVA,$/,,# ANOVA ##$%'*!# 1. $!;' *$,$!;' (''
14 3! "#!$%# $# $&'('$)!! (Analysis of Variance : ANOVA) *& & "#!# +, ANOVA -& $ $ (+,$ ''$) *$#'$)!!#! (Multivariate Analysis of Variance : MANOVA).*& ANOVA *+,'$)$/*! $#/#-, $(,!0'%1)!', #($!#$ # *&,
More informationConfidence Interval for the mean response
Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.
More informationGROUPED DATA E.G. FOR SAMPLE OF RAW DATA (E.G. 4, 12, 7, 5, MEAN G x / n STANDARD DEVIATION MEDIAN AND QUARTILES STANDARD DEVIATION
FOR SAMPLE OF RAW DATA (E.G. 4, 1, 7, 5, 11, 6, 9, 7, 11, 5, 4, 7) BE ABLE TO COMPUTE MEAN G / STANDARD DEVIATION MEDIAN AND QUARTILES Σ ( Σ) / 1 GROUPED DATA E.G. AGE FREQ. 0-9 53 10-19 4...... 80-89
More informationChapter 7, continued: MANOVA
Chapter 7, continued: MANOVA The Multivariate Analysis of Variance (MANOVA) technique extends Hotelling T 2 test that compares two mean vectors to the setting in which there are m 2 groups. We wish to
More informationOther hypotheses of interest (cont d)
Other hypotheses of interest (cont d) In addition to the simple null hypothesis of no treatment effects, we might wish to test other hypothesis of the general form (examples follow): H 0 : C k g β g p
More informationANOVA Longitudinal Models for the Practice Effects Data: via GLM
Psyc 943 Lecture 25 page 1 ANOVA Longitudinal Models for the Practice Effects Data: via GLM Model 1. Saturated Means Model for Session, E-only Variances Model (BP) Variances Model: NO correlation, EQUAL
More informationRepeated Measures Part 2: Cartoon data
Repeated Measures Part 2: Cartoon data /*********************** cartoonglm.sas ******************/ options linesize=79 noovp formdlim='_'; title 'Cartoon Data: STA442/1008 F 2005'; proc format; /* value
More informationModels with qualitative explanatory variables p216
Models with qualitative explanatory variables p216 Example gen = 1 for female Row gpa hsm gen 1 3.32 10 0 2 2.26 6 0 3 2.35 8 0 4 2.08 9 0 5 3.38 8 0 6 3.29 10 0 7 3.21 8 0 8 2.00 3 0 9 3.18 9 0 10 2.34
More informationMultivariate Data Analysis a survey of data reduction and data association techniques: Principal Components Analysis
Multivariate Data Analysis a survey of data reduction and data association techniques: Principal Components Analysis For example Data reduction approaches Cluster analysis Principal components analysis
More informationThe SAS System 18:28 Saturday, March 10, Plot of Canonical Variables Identified by Cluster
The SAS System 18:28 Saturday, March 10, 2018 1 The FASTCLUS Procedure Replace=FULL Radius=0 Maxclusters=2 Maxiter=10 Converge=0.02 Initial Seeds Cluster SepalLength SepalWidth PetalLength PetalWidth 1
More informationAn Introduction to Multivariate Statistical Analysis
An Introduction to Multivariate Statistical Analysis Third Edition T. W. ANDERSON Stanford University Department of Statistics Stanford, CA WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents
More informationCanonical Correlations
Canonical Correlations Like Principal Components Analysis, Canonical Correlation Analysis looks for interesting linear combinations of multivariate observations. In Canonical Correlation Analysis, a multivariate
More informationApplied Multivariate Analysis
Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017 Discriminant Analysis Background 1 Discriminant analysis Background General Setup for the Discriminant Analysis Descriptive
More informationMultivariate Linear Models
Multivariate Linear Models Stanley Sawyer Washington University November 7, 2001 1. Introduction. Suppose that we have n observations, each of which has d components. For example, we may have d measurements
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Comparisons of Several Multivariate Populations Edps/Soc 584 and Psych 594 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Canonical Edps/Soc 584 and Psych 594 Applied Multivariate Statistics Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Canonical Slide
More informationINFERENCE FOR REGRESSION
CHAPTER 3 INFERENCE FOR REGRESSION OVERVIEW In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following. We
More informationDepartment of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000
Department of Mathematics & Statistics STAT 2593 Final Examination 17 April, 2000 TIME: 3 hours. Total marks: 80. (Marks are indicated in margin.) Remember that estimate means to give an interval estimate.
More informationChapter 14. Multiple Regression Models. Multiple Regression Models. Multiple Regression Models
Chapter 14 Multiple Regression Models 1 Multiple Regression Models A general additive multiple regression model, which relates a dependent variable y to k predictor variables,,, is given by the model equation
More informationMultivariate Data Analysis Notes & Solutions to Exercises 3
Notes & Solutions to Exercises 3 ) i) Measurements of cranial length x and cranial breadth x on 35 female frogs 7.683 0.90 gave x =(.860, 4.397) and S. Test the * 4.407 hypothesis that =. Using the result
More informationExamination paper for TMA4255 Applied statistics
Department of Mathematical Sciences Examination paper for TMA4255 Applied statistics Academic contact during examination: Anna Marie Holand Phone: 951 38 038 Examination date: 16 May 2015 Examination time
More informationMANOVA is an extension of the univariate ANOVA as it involves more than one Dependent Variable (DV). The following are assumptions for using MANOVA:
MULTIVARIATE ANALYSIS OF VARIANCE MANOVA is an extension of the univariate ANOVA as it involves more than one Dependent Variable (DV). The following are assumptions for using MANOVA: 1. Cell sizes : o
More informationCovariance Structure Approach to Within-Cases
Covariance Structure Approach to Within-Cases Remember how the data file grapefruit1.data looks: Store sales1 sales2 sales3 1 62.1 61.3 60.8 2 58.2 57.9 55.1 3 51.6 49.2 46.2 4 53.7 51.5 48.3 5 61.4 58.7
More informationApplication of Ghosh, Grizzle and Sen s Nonparametric Methods in. Longitudinal Studies Using SAS PROC GLM
Application of Ghosh, Grizzle and Sen s Nonparametric Methods in Longitudinal Studies Using SAS PROC GLM Chan Zeng and Gary O. Zerbe Department of Preventive Medicine and Biometrics University of Colorado
More informationConfidence Intervals, Testing and ANOVA Summary
Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0
More informationSimple Linear Regression: A Model for the Mean. Chap 7
Simple Linear Regression: A Model for the Mean Chap 7 An Intermediate Model (if the groups are defined by values of a numeric variable) Separate Means Model Means fall on a straight line function of the
More informationOrthogonal contrasts for a 2x2 factorial design Example p130
Week 9: Orthogonal comparisons for a 2x2 factorial design. The general two-factor factorial arrangement. Interaction and additivity. ANOVA summary table, tests, CIs. Planned/post-hoc comparisons for the
More informationANOVA: Analysis of Variation
ANOVA: Analysis of Variation The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative variables depend on which group (given by categorical
More informationSTA 437: Applied Multivariate Statistics
Al Nosedal. University of Toronto. Winter 2015 1 Chapter 5. Tests on One or Two Mean Vectors If you can t explain it simply, you don t understand it well enough Albert Einstein. Definition Chapter 5. Tests
More informationSchool of Mathematical Sciences. Question 1
School of Mathematical Sciences MTH5120 Statistical Modelling I Practical 8 and Assignment 7 Solutions Question 1 Figure 1: The residual plots do not contradict the model assumptions of normality, constant
More informationMultivariate Statistical Analysis
Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 17 for Applied Multivariate Analysis Outline Multivariate Analysis of Variance 1 Multivariate Analysis of Variance The hypotheses:
More informationAn Introduction to Multivariate Methods
Chapter 12 An Introduction to Multivariate Methods Multivariate statistical methods are used to display, analyze, and describe data on two or more features or variables simultaneously. I will discuss multivariate
More informationComparisons of Several Multivariate Populations
Comparisons of Several Multivariate Populations Edps/Soc 584, Psych 594 Carolyn J Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign c Board of Trustees,
More informationCollege Desirability: A Multivariate Statistical Analysis
College Desirability: A Multivariate Statistical Analysis Andrea M Austin Terrell A Felder Lindsay M Moomaw St. Michael s College North Carolina A & T St U Baldwin-Wallace College Colchester, VT 05439
More informationDiscriminant Analysis
Discriminant Analysis V.Čekanavičius, G.Murauskas 1 Discriminant analysis one categorical variable depends on one or more normaly distributed variables. Can be used for forecasting. V.Čekanavičius, G.Murauskas
More informationData Analyses in Multivariate Regression Chii-Dean Joey Lin, SDSU, San Diego, CA
Data Analyses in Multivariate Regression Chii-Dean Joey Lin, SDSU, San Diego, CA ABSTRACT Regression analysis is one of the most used statistical methodologies. It can be used to describe or predict causal
More informationSTA220H1F Term Test Oct 26, Last Name: First Name: Student #: TA s Name: or Tutorial Room:
STA0HF Term Test Oct 6, 005 Last Name: First Name: Student #: TA s Name: or Tutorial Room: Time allowed: hour and 45 minutes. Aids: one sided handwritten aid sheet + non-programmable calculator Statistical
More informationSMAM 314 Exam 42 Name
SMAM 314 Exam 42 Name Mark the following statements True (T) or False (F) (10 points) 1. F A. The line that best fits points whose X and Y values are negatively correlated should have a positive slope.
More informationIntroduction to Regression
Introduction to Regression Using Mult Lin Regression Derived variables Many alternative models Which model to choose? Model Criticism Modelling Objective Model Details Data and Residuals Assumptions 1
More informationChapter 9. Multivariate and Within-cases Analysis. 9.1 Multivariate Analysis of Variance
Chapter 9 Multivariate and Within-cases Analysis 9.1 Multivariate Analysis of Variance Multivariate means more than one response variable at once. Why do it? Primarily because if you do parallel analyses
More informationNon-parametric (Distribution-free) approaches p188 CN
Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman s rank correlation; the Bootstrap. (ch14
More informationLINEAR REGRESSION ANALYSIS. MODULE XVI Lecture Exercises
LINEAR REGRESSION ANALYSIS MODULE XVI Lecture - 44 Exercises Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Exercise 1 The following data has been obtained on
More informationANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment
More informationThis document contains 3 sets of practice problems.
P RACTICE PROBLEMS This document contains 3 sets of practice problems. Correlation: 3 problems Regression: 4 problems ANOVA: 8 problems You should print a copy of these practice problems and bring them
More information(1) The explanatory or predictor variables may be qualitative. (We ll focus on examples where this is the case.)
Introduction to Analysis of Variance Analysis of variance models are similar to regression models, in that we re interested in learning about the relationship between a dependent variable (a response)
More information23. Inference for regression
23. Inference for regression The Practice of Statistics in the Life Sciences Third Edition 2014 W. H. Freeman and Company Objectives (PSLS Chapter 23) Inference for regression The regression model Confidence
More informationDiscrete Multivariate Statistics
Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are
More informationGroup comparison test for independent samples
Group comparison test for independent samples The purpose of the Analysis of Variance (ANOVA) is to test for significant differences between means. Supposing that: samples come from normal populations
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2
Multilevel Models in Matrix Form Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Today's Lecture Linear models from a matrix perspective An example of how to do
SMAM 314 Practice Final Examination Winter 2003
SMAM 314 Practice Final Examination Winter 2003 You may use your textbook, one page of notes and a calculator. Please hand in the notes with your exam. 1. Mark the following statements True T or False
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE IN STATISTICS, 009 MODULE 4 : Linear models Time allowed: One and a half hours Candidates should answer THREE questions. Each question carries
Correlation & Simple Regression
Chapter 11 Correlation & Simple Regression The previous chapter dealt with inference for two categorical variables. In this chapter, we would like to examine the relationship between two quantitative variables.
University of Illinois at Urbana-Champaign
Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign © Board of Trustees,
Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA
Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED Maribeth Johnson Medical College of Georgia Augusta, GA Overview Introduction to longitudinal data Describe the data for examples
Neuendorf MANOVA /MANCOVA. Model: X1 (Factor A) X2 (Factor B) X1 x X2 (Interaction) Y4. Like ANOVA/ANCOVA:
1 Neuendorf MANOVA /MANCOVA Model: X1 (Factor A) X2 (Factor B) X1 x X2 (Interaction) Y1 Y2 Y3 Y4 Like ANOVA/ANCOVA: 1. Assumes equal variance (equal covariance matrices) across cells (groups defined by
Statistics 135 Fall 2008 Final Exam
Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations
PART I. (a) Describe all the assumptions for a normal error regression model with one predictor variable,
Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/2 01 Examination Date Time Pages Final December 2002 3 hours 6 Instructors Course Examiner Marks Y.P.
Homework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework 2: Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
STAT 360-Linear Models
STAT 360-Linear Models Instructor: Yogendra P. Chaubey Sample Test Questions Fall 004 Note: The following questions are from previous tests and exams. The final exam will be for three hours and will contain
M A N O V A. Multivariate ANOVA. Data
M A N O V A Multivariate ANOVA V. Čekanavičius, G. Murauskas 1 Data k groups; Each respondent has m measurements; Observations are from the multivariate normal distribution. No outliers. Covariance matrices
Rerandomization to Balance Covariates
Rerandomization to Balance Covariates Kari Lock Morgan Department of Statistics Penn State University Joint work with Don Rubin University of Minnesota Biostatistics 4/27/16 The Gold Standard Randomized
Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.
Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a
1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College
1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative
Lecture 5: Hypothesis tests for more than one sample
Lecture 5: Hypothesis tests for more than one sample Måns Thulin Department of Mathematics, Uppsala University thulin@math.uu.se Multivariate Methods 8/4 2011 Outline Paired comparisons Repeated
Quiz #3 Research Hypotheses that Involve Comparing Non-Nested Models
Quiz #3 Research Hypotheses that Involve Comparing Non-Nested Models The researcher also wanted to test the hypothesis that students with internal versus external locus of control could be better distinguished
Ch 13 & 14 - Regression Analysis
Ch 13 & 14 - Regression Analysis Simple Regression Model I. Multiple Choice: A simple regression is a regression model that contains a. only one independent variable b. only one dependent variable c. more
[4+3+3] Q 1. (a) Describe the normal regression model through origin. Show that the least square estimator of the regression parameter is given by
Concordia University Department of Mathematics and Statistics Course Number Section Statistics 360/1 40 Examination Date Time Pages Final June 2004 3 hours 7 Instructors Course Examiner Marks Y.P. Chaubey
DISCRIMINANT ANALYSIS. 1. Introduction
DISCRIMINANT ANALYSIS. 1. Introduction Discrimination and classification are concerned with separating objects from different populations into different groups and with allocating new observations to one
Department of Mathematics and Mathematical Statistics, Umeå University, November 7, Assignment 3. Mariam Shirdel
Department of Mathematics and Mathematical Statistics, Umeå University, November 7, 2011. Assignment 3. Mariam Shirdel (mash0007@student.umu.se). Quality Engineering and Experimental Design, 7.5 credits. 1 Exercise
Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means
Stat 529 (Winter 2011) Analysis of Variance (ANOVA) Reading: Sections 5.1–5.3. Introduction and notation Birthweight example Disadvantages of using many pooled t procedures The analysis of variance procedure
Discrimination: finding the features that separate known groups in a multivariate sample.
Discrimination and Classification Goals: Discrimination: finding the features that separate known groups in a multivariate sample. Classification: developing a rule to allocate a new object into one of
Applied Multivariate and Longitudinal Data Analysis
Applied Multivariate and Longitudinal Data Analysis Chapter 2: Inference about the mean vector(s) II Ana-Maria Staicu SAS Hall 5220; 919-515-0644; astaicu@ncsu.edu 1 1 Compare Means from More Than Two
Experimental Design and Data Analysis for Biologists
Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1
1. Least squares with more than one predictor
Statistics 1 Lecture ( November ) c David Pollard Page 1 Read M&M Chapter (skip part on logistic regression, pages 730 731). Read M&M pages 1, for ANOVA tables. Multiple regression. 1. Least squares with
Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition
Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world
Section 4.6 Simple Linear Regression
Section 4.6 Simple Linear Regression Objectives: basic philosophy of SLR and the regression assumptions; point & interval estimation of the model parameters, and how to make predictions; point and interval
SMAM 314 Computer Assignment 5 due Nov 8, 2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot.
SMAM 314 Computer Assignment 5 due Nov 8, 2012 Data Set 1. For each of the following data sets use Minitab to 1. Make a scatterplot. 2. Fit the linear regression line. Regression Analysis: y versus x y
Histogram of Residuals. Residual Normal Probability Plot. Reg. Analysis Check Model Utility. (con t) Check Model Utility. Inference.
Steps for Regression Simple Linear Regression Make a Scatter plot Does it make sense to plot a line? Check Residual Plot (Residuals vs. X) Are there any patterns? Check Histogram of Residuals Is it Normal?
Simple Linear Regression. Steps for Regression. Example. Make a Scatter plot. Check Residual Plot (Residuals vs. X)
Simple Linear Regression 1 Steps for Regression Make a Scatter plot Does it make sense to plot a line? Check Residual Plot (Residuals vs. X) Are there any patterns? Check Histogram of Residuals Is it Normal?
One-Way Analysis of Variance (ANOVA)
1 One-Way Analysis of Variance (ANOVA) One-Way Analysis of Variance (ANOVA) is a method for comparing the means of a populations. This kind of problem arises in two different settings 1. When a independent
Analysis of variance, multivariate (MANOVA)
Analysis of variance, multivariate (MANOVA) Abstract: A designed experiment is set up in which the system studied is under the control of an investigator. The individuals, the treatments, the variables
Steps for Regression. Simple Linear Regression. Data. Example. Residuals vs. X. Scatterplot. Make a Scatter plot Does it make sense to plot a line?
Steps for Regression Simple Linear Regression Make a Scatter plot Does it make sense to plot a line? Check Residual Plot (Residuals vs. X) Are there any patterns? Check Histogram of Residuals Is it Normal?