Measuring relationships among multiple responses
Linear association (correlation, relatedness, shared information) between pair-wise responses is an important property used in almost all multivariate analyses. It can be quantified as covariance or its standardized form, correlation (r for a sample, estimating ρ for the population).

Parametric: the Pearson product-moment correlation coefficient, calculated by all statistical software (the default in SAS). It is the covariance between a pair of responses standardized by the standard deviations of the two responses, as calculated before.

> When to use parametric or non-parametric tests?
>> A common misconception is that non-parametric procedures are assumption-free. Non-parametric procedures are merely relaxed with regard to normality and equality of variance.

Non-parametric (in PROC CORR of Base SAS Software; see SAS Procedures Guide):
> Spearman rank-order correlation (ranks substituted into the Pearson correlation formula)
> Kendall's tau-b, based on the number of concordances and discordances of ranks in paired responses
> Hoeffding's measure of dependence (D), a nonparametric measure of association that detects more general departures from independence. The statistic approximates a weighted sum over observations of chi-square statistics for two-by-two classification tables (Hoeffding 1948); each set of (x, y) values serves as cut points for the classification.
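The three coefficients that PROC CORR reports can be cross-checked outside SAS. As an illustrative sketch (not part of the SAS notes), scipy.stats computes the same Pearson, Spearman, and Kendall tau-b statistics on simulated data:

```python
# Illustrative cross-check in Python: the same three coefficients
# computed by PROC CORR are available in scipy.stats.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = 2.0 * x + rng.normal(scale=0.5, size=30)    # linearly related responses

r_pearson, p_pearson = stats.pearsonr(x, y)     # parametric, linear association
r_spearman, p_spearman = stats.spearmanr(x, y)  # ranks fed into Pearson formula
tau_b, p_tau = stats.kendalltau(x, y)           # concordances vs. discordances

print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {r_spearman:.3f}, "
      f"Kendall tau-b = {tau_b:.3f}")
```

For strongly linear data like this, all three coefficients are large and positive, though tau-b is typically smaller in magnitude than the other two.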
The formula for Hoeffding's D is

    D = 30 [ (n-2)(n-3) D1 + D2 - 2(n-2) D3 ] / [ n(n-1)(n-2)(n-3)(n-4) ]

where

    D1 = Σ_i (Q_i - 1)(Q_i - 2)
    D2 = Σ_i (R_i - 1)(R_i - 2)(S_i - 1)(S_i - 2)
    D3 = Σ_i (R_i - 2)(S_i - 2)(Q_i - 1)

Here R_i is the rank of x_i, S_i is the rank of y_i, and Q_i (also called the bivariate rank) is 1 plus the number of points with both x and y values less than the ith point. A point that is tied on only the x value or the y value contributes 1/2 to Q_i if the other value is less than the corresponding value for the ith point. A point that is tied on both x and y contributes 1/4 to Q_i.

PROC CORR obtains the Q_i values by first ranking the data. The data are then double sorted: observations are ranked according to values of the first variable and reranked according to values of the second variable. Hoeffding's D statistic is computed using the number of interchanges of the first variable. When no ties occur among data set observations, the D statistic values are between -0.5 and 1, with 1 indicating complete dependence. However, when ties occur, the D statistic may result in a smaller value; that is, for a pair of variables with identical values, Hoeffding's D may be less than 1. With a large number of ties in a small data set, the D statistic may be less than -0.5. For more information on Hoeffding's D, see Hollander and Wolfe (1973, p. 228).
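The rank-based formula above can be coded directly. This is a minimal sketch for the no-ties case only (the 1/2 and 1/4 tie adjustments are omitted); `hoeffding_d` is a hypothetical helper name, not a SAS or scipy function:

```python
# A minimal sketch of Hoeffding's D for untied data, following the
# rank-based formula above (R_i, S_i, and the bivariate rank Q_i).
import numpy as np
from scipy.stats import rankdata

def hoeffding_d(x, y):
    """Hoeffding's D; assumes no ties in x or y, and n >= 5."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    R = rankdata(x)                       # ranks of x
    S = rankdata(y)                       # ranks of y
    # bivariate rank: 1 + number of points below-left of point i
    Q = np.array([1 + np.sum((x < x[i]) & (y < y[i])) for i in range(n)])
    D1 = np.sum((Q - 1) * (Q - 2))
    D2 = np.sum((R - 1) * (R - 2) * (S - 1) * (S - 2))
    D3 = np.sum((R - 2) * (S - 2) * (Q - 1))
    return 30.0 * ((n - 2) * (n - 3) * D1 + D2 - 2 * (n - 2) * D3) / (
        n * (n - 1) * (n - 2) * (n - 3) * (n - 4))

x = np.arange(1.0, 11.0)
print(hoeffding_d(x, x))    # complete dependence -> 1.0
```

Note that D measures dependence, not its direction: a perfectly decreasing relationship (y = -x) also yields D = 1.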
The correlation coefficient measures linear association.
> If r approaches zero, it only means a lack of linear association; it does not mean that the variables are unrelated — there may be a non-linear relation.
> Mathematically, r ranges between -1 and 1, and thus it can be zero, but you may never see r = 0 with real data.

How to make an inference from an r calculated from a sample to its population (testing a hypothesis)?

Hypothesis testing
Looking at the value of r?
> How large is large?
>> Depends on the field.

Statistical test
H0: r = 0? or ρ = 0? (which form is correct?)
HA:
1) Two-sided test statistic: a t-test, t = r / SE_r, where SE_r = sqrt[(1 - r²)/(n - 2)], to be compared with a tabular t(α, n-2) from a two-sided t-table.
2) One-sided test statistic: the same t-test as above, to be compared with a tabular t(2α, n-2) from a two-sided t-table.
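The t-test above can be checked numerically. This sketch (on simulated data) computes t = r / SE_r with SE_r = sqrt[(1 - r²)/(n - 2)] and confirms that the resulting two-sided P value matches the one scipy reports for the Pearson correlation:

```python
# Sketch of the t-test for H0: rho = 0 on simulated data:
# t = r / SE_r, SE_r = sqrt((1 - r^2) / (n - 2)), df = n - 2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(size=20)
y = x + rng.normal(size=20)
n = len(x)

r, p_scipy = stats.pearsonr(x, y)
se_r = np.sqrt((1 - r**2) / (n - 2))
t = r / se_r
p_manual = 2 * stats.t.sf(abs(t), df=n - 2)     # two-sided P value

print(f"t = {t:.3f}, manual P = {p_manual:.4g}, scipy P = {p_scipy:.4g}")
```

The two P values agree because the test scipy performs is mathematically the same t-test with n - 2 degrees of freedom.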
Assumptions and ideal conditions for testing the hypothesis

Calculation of r does not rely on any assumption, but its statistical test of significance does, since it uses the laws of probability:
1. Independence of EUs
2. Bivariate normality
3. Randomness of both responses
> usually the case in observational studies
> not the case in experimental studies, where one variable (x) is fixed and the response to it (y) is random
Example using SAS: This example produces a correlation analysis with descriptive statistics, the Pearson product-moment correlation, the Spearman rank-order correlation, and Hoeffding's measure of dependence, D.

SAS program and DATA (I will try to explain all related SAS code but I may miss some; ask questions when needed):

options nodate pageno=1 linesize=80 pagesize=60;
/* The data set FITNESS contains measurements from a study of physical
   fitness for 30 participants between the ages 38 and 57. Each observation
   represents one person. Two observations contain missing values. */
data fitness;
   input Age Weight Oxygen Runtime;
   datalines;
   ... (data lines not reproduced)
;
proc corr data=fitness Pearson Spearman Hoeffding;
   var weight oxygen runtime;
run;

The CORR Procedure
3 Variables: Weight Oxygen Runtime

[Output: a Simple Statistics table (N, Mean, Std Dev, Median, Minimum, Maximum for Weight, Oxygen, Runtime), followed by the Pearson Correlation Coefficients matrix with Prob > |r| under H0: Rho=0 and the number of observations per pair; numeric entries not reproduced.]
[Output: the Spearman Correlation Coefficients matrix (Prob > |r| under H0: Rho=0) and the Hoeffding Dependence Coefficients matrix (Prob > D under H0: D=0) for Weight, Oxygen, Runtime, each with the number of observations per pair; numeric entries not reproduced apart from the Hoeffding diagonal entry of 0.99 for Weight, which is less than 1 because of ties.]
Note: P values reported in the SAS output are for two-sided tests. If your HA is directional and the statistical results confirm the hypothesized direction, then the actual P value is half of what is reported in the SAS output.

As an option in the PROC CORR statement, you may add COV to get the variance/covariance matrix, e.g.:

proc corr data=fitness COV;
   *pearson, spearman, and hoeffding options not included here, thus it runs the Pearson correlation by default;
   var weight oxygen runtime;
run;
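For intuition about what the COV option returns, the same variance/covariance matrix can be sketched in numpy; `np.cov` uses the divisor n - 1, which matches SAS's default VARDEF=DF:

```python
# Illustrative sketch of a variance/covariance matrix like the one
# the COV option produces; divisor n - 1 matches SAS's VARDEF=DF default.
import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(size=(30, 3))        # 30 subjects, 3 responses
C = np.cov(data, rowvar=False)         # 3 x 3 variance/covariance matrix

# diagonal = variances, off-diagonal = pairwise covariances
print(np.allclose(np.diag(C), data.var(axis=0, ddof=1)))
print(np.allclose(C, C.T))             # the matrix is symmetric
```

With pairwise-complete data in SAS (missing values present), each cell's N and DF can differ, which is why the per-cell variances in the SAS output may vary; numpy's listwise computation above uses one N throughout.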
Output: Measures of Association for a Physical Fitness Study

The CORR Procedure
3 Variables: Weight Oxygen Runtime

[Output: a Variances and Covariances matrix for Weight, Oxygen, Runtime; each cell reports three values — Covariance / Row Var Variance / Col Var Variance — together with the DF; numeric entries not reproduced.]

Why are there three values, sometimes the same and sometimes different, for the statistic (var/cov) in each cell?
The same as before:

[Output: Simple Statistics and the Pearson Correlation Coefficients matrix (Prob > |r| under H0: Rho=0, with the number of observations per pair) for Weight, Oxygen, Runtime; numeric entries not reproduced.]
Partial Correlation (SAS online Doc.)

A partial correlation measures the strength of a relationship between two variables while controlling the effect of one or more additional variables. The Pearson partial correlation for a pair of variables may be defined as the correlation of errors after regression on the controlling variables.

Let y be the set of variables to correlate. Also let α and β be sets of regression parameters and z be the set of controlling variables, where α is the intercept, β is the slope, and E(y | z) = α + zβ is a regression model for y given z. The population Pearson partial correlation between the ith and jth variables of y given z is defined as the correlation between the errors (y_i − E(y_i | z)) and (y_j − E(y_j | z)).

If the exact values of α and β are unknown, you can use a sample Pearson partial correlation to estimate the population Pearson partial correlation. For a given sample of observations, you estimate the sets of unknown parameters α and β using the least-squares estimators α̂ and β̂. The fitted least-squares regression model is then ŷ = α̂ + zβ̂.

The partial corrected sums of squares and crossproducts (CSSCP) of y given z are the corrected sums of squares and crossproducts of the residuals y − ŷ. Using these partial corrected sums of squares and crossproducts, you can calculate the partial variances, partial covariances, and partial correlations.

PROC CORR derives the partial corrected sums of squares and crossproducts matrix by applying the Cholesky decomposition algorithm to the CSSCP matrix. For Pearson partial correlations, let S be the partitioned CSSCP matrix between the two sets of variables, z and y:

    S = | S_zz   S_zy |
        | S'_zy  S_yy |
PROC CORR calculates S_yy.z, the partial CSSCP matrix of y after controlling for z, by applying the Cholesky decomposition algorithm sequentially on the rows associated with z, the variables being partialled out.

After applying the Cholesky decomposition algorithm to each row associated with the variables z, PROC CORR checks all higher-numbered diagonal elements associated with z for singularity. After the Cholesky decomposition, a variable is considered singular if the value of the corresponding diagonal element is less than ε times the original unpartialled corrected sum of squares of that variable. You can specify the singularity criterion ε using the SINGULAR= option. For Pearson partial correlations, a controlling variable is considered singular if the R² for predicting this variable from the variables that are already partialled out exceeds 1 − ε. When this happens, PROC CORR excludes the variable from the analysis. Similarly, a variable is considered singular if the R² for predicting this variable from the controlling variables exceeds 1 − ε. When this happens, its associated diagonal element and all higher-numbered elements in this row or column are set to zero.

After the Cholesky decomposition algorithm is performed on all rows associated with z, the resulting matrix has the form

    T = | T_zz  T_zy   |
        | 0     S_yy.z |

where T_zz is an upper triangular matrix.
If S_zz is positive definite, then the partial CSSCP matrix S_yy.z is identical to the matrix derived from the formula

    S_yy.z = S_yy − S'_zy S_zz⁻¹ S_zy

The partial variance-covariance matrix is calculated with the variance divisor (VARDEF= option). PROC CORR can then use the standard Pearson correlation formula on the partial variance-covariance matrix to calculate the Pearson partial correlation matrix.

Another way to calculate the Pearson partial correlation is to apply the Cholesky decomposition algorithm directly to the correlation matrix and use the correlation formula on the resulting matrix.

To derive the corresponding Spearman partial rank-order correlations and Kendall partial tau-b correlations, PROC CORR applies the Cholesky decomposition algorithm to the Spearman rank-order correlation matrix and the Kendall tau-b correlation matrix and uses the correlation formula. The singularity criterion for nonparametric partial correlations is identical to that for Pearson partial correlations, except that PROC CORR uses a matrix of nonparametric correlations and sets a singular variable's associated correlations to missing. The partial tau-b correlations range from -1 to 1; however, the sampling distribution of this partial tau-b is unknown, so probability values are not available.

When a correlation matrix (Pearson, Spearman, or Kendall tau-b) is positive definite, the resulting partial correlation between variables x and y after adjusting for a single variable z is identical to that obtained from the first-order partial correlation formula

    r_xy.z = (r_xy − r_xz r_yz) / sqrt[(1 − r_xz²)(1 − r_yz²)]

where r_xy, r_xz, and r_yz are the appropriate correlations.
The formula for higher-order partial correlations is a straightforward extension of the above first-order formula. For example, when the correlation matrix is positive definite, the partial correlation between x and y controlling for both z1 and z2 is identical to the second-order partial correlation formula

    r_xy.z1z2 = (r_xy.z1 − r_xz2.z1 r_yz2.z1) / sqrt[(1 − r_xz2.z1²)(1 − r_yz2.z1²)]

where r_xy.z1, r_xz2.z1, and r_yz2.z1 are first-order partial correlations among the variables x, y, and z2 given z1.
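The two definitions of the partial correlation — correlating residuals after regression on the controlling variable, and the closed-form first-order formula — can be checked against each other. A sketch on simulated data:

```python
# Sketch verifying the first-order partial correlation two ways:
# (1) correlate least-squares residuals after regressing x and y on z;
# (2) r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)).
import numpy as np

rng = np.random.default_rng(11)
z = rng.normal(size=50)
x = z + rng.normal(size=50)    # x and y both driven by z
y = z + rng.normal(size=50)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

def resid(a, z):
    # least-squares residuals of a regressed on an intercept and z
    X = np.column_stack([np.ones_like(z), z])
    beta, *_ = np.linalg.lstsq(X, a, rcond=None)
    return a - X @ beta

r_xy, r_xz, r_yz = corr(x, y), corr(x, z), corr(y, z)
formula = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
residual = corr(resid(x, z), resid(y, z))

print(formula, residual)    # the two agree
```

Here x and y are correlated only through z, so the partial correlation is near zero even though the raw r_xy is substantial.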
SAS example for partial correlation:

options nodate pageno=1 linesize=120 pagesize=60;
proc corr data=fitness spearman kendall cov nosimple outp=fitcorr;
   var weight oxygen runtime;
   partial age;
run;

Partial Correlations for a Fitness and Exercise Study

The CORR Procedure
1 Partial Variables: Age
3 Variables: Weight Oxygen Runtime

[Output: the Partial Covariance Matrix, DF = 26, for Weight, Oxygen, Runtime; numeric entries not reproduced.]
[Output: Pearson Partial Correlation Coefficients, N = 28 (Prob > |r| under H0: Partial Rho=0), followed by Spearman Partial Correlation Coefficients, N = 28, for Weight, Oxygen, Runtime; numeric entries not reproduced.]
[Output: Kendall Partial Tau-b Correlation Coefficients, N = 28, for Weight, Oxygen, Runtime; numeric entries not reproduced. No probability values are reported for partial tau-b.]
Confidence Interval for the Correlation Coefficient

Are the actual value of a correlation coefficient and its significance level sufficient to make a judgment regarding the association between two responses? What else could improve the inference? Why? Measured by what?
Confidence interval for the correlation coefficient

A. Graphic (Johnson D. E., p. 37)
1. r = 0.85, n = 5, C.I.: -.12 < ρ < .95. May we test H0 with a C.I.? What is the conclusion here?
2. r = 0.7, n = 25, C.I.: .41 < ρ < .85. What is the conclusion regarding H0 with this C.I.?
B. Fisher's approximation, used for n > 25:

    tanh[arctanh(r) − z(α/2)/√(n−3)] < ρ < tanh[arctanh(r) + z(α/2)/√(n−3)]

For r = 0.7, n = 25, C.I.: .42 < ρ < .86, which is close to what we obtained from the graph.

C. Ruben's approximation
The computation is a little complex; see Johnson D.E., page 39 for the relevant SAS code.

Power and sample size calculations for correlation?

Use of the correlation matrix in grouping variables
> reducing dimensionality
> relation to PCA and FA
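Fisher's approximation is easy to compute directly. This sketch reproduces the r = 0.7, n = 25 interval from the notes:

```python
# Fisher's approximation for a CI on rho:
# tanh[arctanh(r) -/+ z(alpha/2) / sqrt(n - 3)].
import numpy as np
from scipy import stats

r, n, alpha = 0.7, 25, 0.05
z = stats.norm.ppf(1 - alpha / 2)          # 1.96 for a 95% interval
half = z / np.sqrt(n - 3)
lo = np.tanh(np.arctanh(r) - half)
hi = np.tanh(np.arctanh(r) + half)
print(f"{lo:.2f} < rho < {hi:.2f}")        # 0.42 < rho < 0.86
```

The interval (.42, .86) matches the value in the notes and is close to the graphic interval (.41, .85); note it is not symmetric around r = 0.7, reflecting the skewed sampling distribution of r.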
More informationSAS/STAT 13.1 User s Guide. The MIANALYZE Procedure
SAS/STAT 13.1 User s Guide The MIANALYZE Procedure This document is an individual chapter from SAS/STAT 13.1 User s Guide. The correct bibliographic citation for the complete manual is as follows: SAS
More informationChapter 9. Multivariate and Within-cases Analysis. 9.1 Multivariate Analysis of Variance
Chapter 9 Multivariate and Within-cases Analysis 9.1 Multivariate Analysis of Variance Multivariate means more than one response variable at once. Why do it? Primarily because if you do parallel analyses
More informationANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS
ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing
More informationIntroduction to Nonparametric Analysis (Chapter)
SAS/STAT 9.3 User s Guide Introduction to Nonparametric Analysis (Chapter) SAS Documentation This document is an individual chapter from SAS/STAT 9.3 User s Guide. The correct bibliographic citation for
More informationMULTIVARIATE HOMEWORK #5
MULTIVARIATE HOMEWORK #5 Fisher s dataset on differentiating species of Iris based on measurements on four morphological characters (i.e. sepal length, sepal width, petal length, and petal width) was subjected
More informationCHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)
FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter
More information4.8 Alternate Analysis as a Oneway ANOVA
4.8 Alternate Analysis as a Oneway ANOVA Suppose we have data from a two-factor factorial design. The following method can be used to perform a multiple comparison test to compare treatment means as well
More informationτ xy N c N d n(n 1) / 2 n +1 y = n 1 12 v n n(n 1) ρ xy Brad Thiessen
Introduction: Effect of Correlation Types on Nonmetric Multidimensional Scaling of ITED Subscore Data Brad Thiessen In a discussion of potential data types that can be used as proximity measures, Coxon
More information3 Joint Distributions 71
2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random
More informationAnalysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA
Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED Maribeth Johnson Medical College of Georgia Augusta, GA Overview Introduction to longitudinal data Describe the data for examples
More informationWeek 7.1--IES 612-STA STA doc
Week 7.1--IES 612-STA 4-573-STA 4-576.doc IES 612/STA 4-576 Winter 2009 ANOVA MODELS model adequacy aka RESIDUAL ANALYSIS Numeric data samples from t populations obtained Assume Y ij ~ independent N(μ
More informationPhysics 509: Non-Parametric Statistics and Correlation Testing
Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationAnalyzing Small Sample Experimental Data
Analyzing Small Sample Experimental Data Session 2: Non-parametric tests and estimators I Dominik Duell (University of Essex) July 15, 2017 Pick an appropriate (non-parametric) statistic 1. Intro to non-parametric
More informationA Non-parametric bootstrap for multilevel models
A Non-parametric bootstrap for multilevel models By James Carpenter London School of Hygiene and ropical Medicine Harvey Goldstein and Jon asbash Institute of Education 1. Introduction Bootstrapping is
More informationBootstrapping, Randomization, 2B-PLS
Bootstrapping, Randomization, 2B-PLS Statistics, Tests, and Bootstrapping Statistic a measure that summarizes some feature of a set of data (e.g., mean, standard deviation, skew, coefficient of variation,
More informationNonparametric Independence Tests
Nonparametric Independence Tests Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Nonparametric
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationCorrelation. We don't consider one variable independent and the other dependent. Does x go up as y goes up? Does x go down as y goes up?
Comment: notes are adapted from BIOL 214/312. I. Correlation. Correlation A) Correlation is used when we want to examine the relationship of two continuous variables. We are not interested in prediction.
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x
More informationPreface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of
Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of Probability Sampling Procedures Collection of Data Measures
More informationOne-way ANOVA Model Assumptions
One-way ANOVA Model Assumptions STAT:5201 Week 4: Lecture 1 1 / 31 One-way ANOVA: Model Assumptions Consider the single factor model: Y ij = µ + α }{{} i ij iid with ɛ ij N(0, σ 2 ) mean structure random
More informationData Analysis as a Decision Making Process
Data Analysis as a Decision Making Process I. Levels of Measurement A. NOIR - Nominal Categories with names - Ordinal Categories with names and a logical order - Intervals Numerical Scale with logically
More informationUnderstand the difference between symmetric and asymmetric measures
Chapter 9 Measures of Strength of a Relationship Learning Objectives Understand the strength of association between two variables Explain an association from a table of joint frequencies Understand a proportional
More informationStructural Equation Modeling and Confirmatory Factor Analysis. Types of Variables
/4/04 Structural Equation Modeling and Confirmatory Factor Analysis Advanced Statistics for Researchers Session 3 Dr. Chris Rakes Website: http://csrakes.yolasite.com Email: Rakes@umbc.edu Twitter: @RakesChris
More information