WORKSHOP 3 Measuring Association

Size: px
Start display at page:

Download "WORKSHOP 3 Measuring Association"

Transcription

1 WORKSHOP 3 Measuring Association Concepts Analysing Categorical Data o Testing of Proportions o Contingency Tables & Tests o Odds Ratios Linear Association Measures o Correlation o Simple Linear Regression Analysis Workshop 3 ~ Measuring Association Page 1 of 1

2 Analysing Categorical Data A review of methods used to describe the relationship between categorical variables / comparison of proportions. o Contingency Tables & Tests Goodness of Fit Association / Independence o Odds Ratios Testing of Proportions ~ can also calculate C.I.s and apply z-test to proportion(s). (Less common approach (REF: 1.7)) Contingency Tables & Tests types of test Goodness of fit Tests of association and independence Goodness of Fit Test Tests whether distribution of a variable conforms to an expected distribution. Workshop 3 ~ Measuring Association Page of 1

3 Example: (REF: Chapter 1) Snapdragon flowers can be coloured red, pink or white. According to Mendelian genetic model, self-pollinated pink flowers should produce progeny plants that are red, pink or white with ratio: 1::1 respectively. => H : Pr(R) =.5; Pr(P) =.5; Pr(W) =.5 Sample of 3 plants produce following colours: Red 5 Pr(R).31 Pink 1 Pr(P).51 White 5 Pr(W). To test H, USE CHI-SQUARE TEST, χ test χ ( O E) = E where O is observed frequency and E is expected frequency Calculations: O E (O-E) /E Red Pink White Compare with χ with (# of categories 1) DF. Workshop 3 ~ Measuring Association Page 3 of 1

4 Pr(χ >.5 from χ DF) =.7 As p-value >.5 (signif level = 5%) we cannot reject H. Note: Critical χ DF = 5 % significance level & Calculated χ is < Critical Value so cannot reject H. Tests of Association & Independence Example: The CF_Genotypes data set contains where patients were genotyped for a specific genetic variation and the patients who were with infected with Pseudomonas aeruginosa were recorded. The expectation was that those with the less common A variant would have more severe disease. SPSS Analysis (Analyse>Descriptive Statistics > Crosstabs) PA Infection Present * API Genotype Variant Crosstabulation OBSERVED Count API Genotype Variant Total A G PA Infection Present No Yes 1 Total Workshop 3 ~ Measuring Association Page of 1

5 H : Rate of PA infection present in both genotypes is the same General Formula for Expected Frequencies: E = row total X column total overall total From SPSS PA Infection Present * API Genotype Variant Crosstabulation API Genotype Variant Total A G PA Infection Present No Count Expected Count (O-E) Residual Yes Count 1 Expected Count (O-E) Residual Total Count Expected Count Chi-Square Tests (sample output) Value df Asymp. Sig. (- sided) Pearson Chi Square Fisher's Exact Test N of Valid 1 Exact Sig. (- sided) Exact Sig. (1- sided).7.17 Cases b 1 cells (5.%) have expected count less than 5. The minimum expected count is.9. Workshop 3 ~ Measuring Association Page 5 of 1

6 Compare with χ with 1 DF. Note: DF = (# rows 1) X (# cols 1) 5% sig. Level cannot reject H. => There is No statistically significant evidence that PA infection rates are higher in the Genotype A group. Odds Ratios Odds of Event E is defined as the ratio of the chance that E occurs v s the chance that E does not occur. Let Pr(E) be the probability (chance) of E occurring => 1 Pr(E) is the probability of E not occurring Odds of E = Pr (E) 1 Pr(E) Example o If the probability of E is ¼, then the Odds of E are {¼ / ¾} = 1/3 or 1:3 o If the probability of E is ½, then Odds of E are 1. Odds Ratio, θ is ratio of odds of two events (or conditions). Example ~ Event 1: Low birth weight in smokers; Event : Low birth weight in non-smokers (REF: 1.9) Workshop 3 ~ Measuring Association Page of 1

7 CF Genotype Data Example API Genotype Variant Total A G PA Infection Present No 1 (n 11 ) 1 (n 1 ) 1 Yes (n 1 ) 1 (n ) Total Odds Ratio compares: o Odds of PA infection in Genotype A group, Odds A with o Odds of PA infection in Genotype G group, Odds G Odds A = /1 = /1 Odds G = 1/1 =.1 1 1/1 Odds Ratio, θˆ =.333 /.1 =.5 => Estimate that the odds of a contracting a PA Infection for patients in Genotype A group are more than twice that for patients in Genotype G group. Workshop 3 ~ Measuring Association Page 7 of 1

8 Note: 1. For X Contingency Table θˆ = n 11 X n n 1 X n 1. Odds Ratio is not Normally Distributed but Log Odds Ratio is. We usually work with C.I. for log Odds Ratio and present results as Exponential of C.I. 3. If Exp of C.I. includes 1, it is possible that odds of both events are equal. Workshop 3 ~ Measuring Association Page of 1

9 Linear Association Measures A review of methods used to describe a LINEAR relationship between continuous variables. o Correlation o Simple linear regression Correlation Describes the strength of a linear relationship between continuous variables Correlation Coefficient range: -1 to 1 o -1 => Perfect Negative Linear Relationship o 1 => Perfect Positive Linear Relationship o => No Linear Relationship Y X Workshop 3 ~ Measuring Association Page 9 of 1

10 1 1 1 Y X 1 Y X 1 1 Y X Workshop 3 ~ Measuring Association Page 1 of 1

11 Simple Linear Regression (SLR) Method of estimating the linear relationship between continuous variables Terminology: o Y: Dependent variable, variable to be predicted o X: Independent variable, Explanatory variable SLR parameters Objective is to estimate straight line that describes relationship between Y & X. Regression Line: Y = α + βx + ε, where error, ε ~ N (,σ ) Require method to estimate α and β. Use method of Least Squares Find estimators, αˆ and βˆ such that S = n i = 1 ( ˆ ) αˆ β y i x i ANOVA for SLR: is minimized o Test H : β = v s H A : β o Divide the total variation in data into: variation due to Regression Line Workshop 3 ~ Measuring Association Page 11 of 1

12 residual variation o Total Variation = Regression + Residual source of df sum of squares mean F- variation square ratio Regression 1 Regression SS SS/df MS reg Residual (Error) Total n- Residual SS SS/df n-1 Total SS MS res Sig. Pr (F < F-ratio) If sig < Significance Level, then reject H. Conclude β and there is evidence of a linear relationship between Y and X. Note: From ANOVA table, MS res provides an unbiased estimate of the random, unexplained variation in the data; i.e. an unbiased estimate of σ Workshop 3 ~ Measuring Association Page 1 of 1

13 R : Co-efficient of Determination The proportion of variation in Y that is attributed to its linear regression on X R Regression Sum = Total Sum of of Squares Squares = S S xx xy S yy Range: 1 Closer to 1 => Better fit of regression line to data R = (Correlation Co-efficient) EXAMPLE Lung Function Data Set FVC (forced vital capacity) and FEV (forced expiratory volume) measure the volume capacity of the lung and air volume expired. Both are standard measurements of lung function and are expected to be highly correlated. Dependent Variable: Y ~ FEV Independent Variable: X ~ FVC SPSS Analysis Scatter plot of FEV v s FVC Workshop 3 ~ Measuring Association Page 13 of 1

14 Forced Expiratory Volume Forced Lung Capacity Correlation & R (SPSS: Analyze > Regression > Linear ) Model Summary Model R R Square Adjusted R Std. Error of the Estimate Square a Predictors: (Constant), Forced Lung Capacity b Dependent Variable: Forced Expiratory Volume Workshop 3 ~ Measuring Association Page 1 of 1

15 ANOVA Table Testing H : β = ANOVA Model Sum of Squares df Mean Square F Sig. 1 Regression Residual Total a Predictors: (Constant), Forced Lung Capacity b Dependent Variable: Forced Expiratory Volume As sig <.5 => there is strong evidence of linear relationship between FEV and FVC. Regression Estimators Coefficients Unstandardized Coefficients t Sig. 95% Confidence Interval for B Model B Std. Error Lower Bound Upper Bound 1 (Constant) Forced Lung Capacity a Dependent Variable: Forced Expiratory Volume Regression Line: FEV = * FVC T-test of β significant => evidence of linear relationship Workshop 3 ~ Measuring Association Page 15 of 1

16 Error Diagnostics Histogram Dependent Variable: Forced Expiratory Volume Frequency Std. Dev = 1. Mean =. N = 11. Regression Standardized Residual Normal P-P Plot of Regression Stand Dependent Variable: Forced Expirato Expected Cum Prob Observed Cum Prob Workshop 3 ~ Measuring Association Page 1 of 1

ESP 178 Applied Research Methods. 2/23: Quantitative Analysis

ESP 178 Applied Research Methods. 2/23: Quantitative Analysis ESP 178 Applied Research Methods 2/23: Quantitative Analysis Data Preparation Data coding create codebook that defines each variable, its response scale, how it was coded Data entry for mail surveys and

More information

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or.

: The model hypothesizes a relationship between the variables. The simplest probabilistic model: or. Chapter Simple Linear Regression : comparing means across groups : presenting relationships among numeric variables. Probabilistic Model : The model hypothesizes an relationship between the variables.

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

Example: Forced Expiratory Volume (FEV) Program L13. Example: Forced Expiratory Volume (FEV) Example: Forced Expiratory Volume (FEV)

Example: Forced Expiratory Volume (FEV) Program L13. Example: Forced Expiratory Volume (FEV) Example: Forced Expiratory Volume (FEV) Program L13 Relationships between two variables Correlation, cont d Regression Relationships between more than two variables Multiple linear regression Two numerical variables Linear or curved relationship?

More information

BMI 541/699 Lecture 22

BMI 541/699 Lecture 22 BMI 541/699 Lecture 22 Where we are: 1. Introduction and Experimental Design 2. Exploratory Data Analysis 3. Probability 4. T-based methods for continous variables 5. Power and sample size for t-based

More information

13.1 Categorical Data and the Multinomial Experiment

13.1 Categorical Data and the Multinomial Experiment Chapter 13 Categorical Data Analysis 13.1 Categorical Data and the Multinomial Experiment Recall Variable: (numerical) variable (i.e. # of students, temperature, height,). (non-numerical, categorical)

More information

MATH ASSIGNMENT 2: SOLUTIONS

MATH ASSIGNMENT 2: SOLUTIONS MATH 204 - ASSIGNMENT 2: SOLUTIONS (a) Fitting the simple linear regression model to each of the variables in turn yields the following results: we look at t-tests for the individual coefficients, and

More information

Multiple linear regression

Multiple linear regression Multiple linear regression Course MF 930: Introduction to statistics June 0 Tron Anders Moger Department of biostatistics, IMB University of Oslo Aims for this lecture: Continue where we left off. Repeat

More information

Introduction to Analysis of Genomic Data Using R Lecture 6: Review Statistics (Part II)

Introduction to Analysis of Genomic Data Using R Lecture 6: Review Statistics (Part II) 1/45 Introduction to Analysis of Genomic Data Using R Lecture 6: Review Statistics (Part II) Dr. Yen-Yi Ho (hoyen@stat.sc.edu) Feb 9, 2018 2/45 Objectives of Lecture 6 Association between Variables Goodness

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3 4 5 6 Full marks

More information

Practical Biostatistics

Practical Biostatistics Practical Biostatistics Clinical Epidemiology, Biostatistics and Bioinformatics AMC Multivariable regression Day 5 Recap Describing association: Correlation Parametric technique: Pearson (PMCC) Non-parametric:

More information

y response variable x 1, x 2,, x k -- a set of explanatory variables

y response variable x 1, x 2,, x k -- a set of explanatory variables 11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate

More information

Correlation and regression

Correlation and regression 1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,

More information

SPSS LAB FILE 1

SPSS LAB FILE  1 SPSS LAB FILE www.mcdtu.wordpress.com 1 www.mcdtu.wordpress.com 2 www.mcdtu.wordpress.com 3 OBJECTIVE 1: Transporation of Data Set to SPSS Editor INPUTS: Files: group1.xlsx, group1.txt PROCEDURE FOLLOWED:

More information

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont. TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted

More information

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination McGill University Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II Final Examination Date: 20th April 2009 Time: 9am-2pm Examiner: Dr David A Stephens Associate Examiner: Dr Russell Steele Please

More information

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical

More information

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression 36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 22, 2015 Lecture 4: Linear Regression TCELL Simple Regression Example Male black wheatear birds carry stones to the nest as a form

More information

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables.

Regression Analysis. BUS 735: Business Decision Making and Research. Learn how to detect relationships between ordinal and categorical variables. Regression Analysis BUS 735: Business Decision Making and Research 1 Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn how to estimate

More information

STAT 4385 Topic 03: Simple Linear Regression

STAT 4385 Topic 03: Simple Linear Regression STAT 4385 Topic 03: Simple Linear Regression Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2017 Outline The Set-Up Exploratory Data Analysis

More information

Assoc.Prof.Dr. Wolfgang Feilmayr Multivariate Methods in Regional Science: Regression and Correlation Analysis REGRESSION ANALYSIS

Assoc.Prof.Dr. Wolfgang Feilmayr Multivariate Methods in Regional Science: Regression and Correlation Analysis REGRESSION ANALYSIS REGRESSION ANALYSIS Regression Analysis can be broadly defined as the analysis of statistical relationships between one dependent and one or more independent variables. Although the terms dependent and

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Multiple Regression and Model Building Lecture 20 1 May 2006 R. Ryznar

Multiple Regression and Model Building Lecture 20 1 May 2006 R. Ryznar Multiple Regression and Model Building 11.220 Lecture 20 1 May 2006 R. Ryznar Building Models: Making Sure the Assumptions Hold 1. There is a linear relationship between the explanatory (independent) variable(s)

More information

Correlation and the Analysis of Variance Approach to Simple Linear Regression

Correlation and the Analysis of Variance Approach to Simple Linear Regression Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation

More information

Finding Relationships Among Variables

Finding Relationships Among Variables Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis

More information

Correlation and simple linear regression S5

Correlation and simple linear regression S5 Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Basic Medical Statistics Course

Basic Medical Statistics Course Basic Medical Statistics Course S7 Logistic Regression November 2015 Wilma Heemsbergen w.heemsbergen@nki.nl Logistic Regression The concept of a relationship between the distribution of a dependent variable

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

LOOKING FOR RELATIONSHIPS

LOOKING FOR RELATIONSHIPS LOOKING FOR RELATIONSHIPS One of most common types of investigation we do is to look for relationships between variables. Variables may be nominal (categorical), for example looking at the effect of an

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Review of Multiple Regression

Review of Multiple Regression Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

Regression ( Kemampuan Individu, Lingkungan kerja dan Motivasi)

Regression ( Kemampuan Individu, Lingkungan kerja dan Motivasi) Regression (, Lingkungan kerja dan ) Descriptive Statistics Mean Std. Deviation N 3.87.333 32 3.47.672 32 3.78.585 32 s Pearson Sig. (-tailed) N Kemampuan Lingkungan Individu Kerja.000.432.49.432.000.3.49.3.000..000.000.000..000.000.000.

More information

Topic 14: Inference in Multiple Regression

Topic 14: Inference in Multiple Regression Topic 14: Inference in Multiple Regression Outline Review multiple linear regression Inference of regression coefficients Application to book example Inference of mean Application to book example Inference

More information

Six Sigma Black Belt Study Guides

Six Sigma Black Belt Study Guides Six Sigma Black Belt Study Guides 1 www.pmtutor.org Powered by POeT Solvers Limited. Analyze Correlation and Regression Analysis 2 www.pmtutor.org Powered by POeT Solvers Limited. Variables and relationships

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6 STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression 1 Correlation indicates the magnitude and direction of the linear relationship between two variables. Linear Regression: variable Y (criterion) is predicted by variable X (predictor)

More information

Correlation and Regression Bangkok, 14-18, Sept. 2015

Correlation and Regression Bangkok, 14-18, Sept. 2015 Analysing and Understanding Learning Assessment for Evidence-based Policy Making Correlation and Regression Bangkok, 14-18, Sept. 2015 Australian Council for Educational Research Correlation The strength

More information

Three-Way Contingency Tables

Three-Way Contingency Tables Newsom PSY 50/60 Categorical Data Analysis, Fall 06 Three-Way Contingency Tables Three-way contingency tables involve three binary or categorical variables. I will stick mostly to the binary case to keep

More information

BIOS 6222: Biostatistics II. Outline. Course Presentation. Course Presentation. Review of Basic Concepts. Why Nonparametrics.

BIOS 6222: Biostatistics II. Outline. Course Presentation. Course Presentation. Review of Basic Concepts. Why Nonparametrics. BIOS 6222: Biostatistics II Instructors: Qingzhao Yu Don Mercante Cruz Velasco 1 Outline Course Presentation Review of Basic Concepts Why Nonparametrics The sign test 2 Course Presentation Contents Justification

More information

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: ) NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3

More information

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression AMS 315/576 Lecture Notes Chapter 11. Simple Linear Regression 11.1 Motivation A restaurant opening on a reservations-only basis would like to use the number of advance reservations x to predict the number

More information

10: Crosstabs & Independent Proportions

10: Crosstabs & Independent Proportions 10: Crosstabs & Independent Proportions p. 10.1 P Background < Two independent groups < Binary outcome < Compare binomial proportions P Illustrative example ( oswege.sav ) < Food poisoning following church

More information

Inference for Regression Simple Linear Regression

Inference for Regression Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

STATISTICS. Multiple regression

STATISTICS. Multiple regression STATISTICS Multiple regression Problem : Explain the price of a ski pass. 2 3 4 Model (Constant) nb pistes SPSS results Unstandardized Coefficients a. Dependent Variable: prix forfait jour Coefficients

More information

Online supplement. Absolute Value of Lung Function (FEV 1 or FVC) Explains the Sex Difference in. Breathlessness in the General Population

Online supplement. Absolute Value of Lung Function (FEV 1 or FVC) Explains the Sex Difference in. Breathlessness in the General Population Online supplement Absolute Value of Lung Function (FEV 1 or FVC) Explains the Sex Difference in Breathlessness in the General Population Table S1. Comparison between patients who were excluded or included

More information

Area1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed)

Area1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed) Institutional Assessment Report Texas Southern University College of Pharmacy and Health Sciences "An Analysis of 2013 NAPLEX, P4-Comp. Exams and P3 courses The following analysis illustrates relationships

More information

SPSS Guide For MMI 409

SPSS Guide For MMI 409 SPSS Guide For MMI 409 by John Wong March 2012 Preface Hopefully, this document can provide some guidance to MMI 409 students on how to use SPSS to solve many of the problems covered in the D Agostino

More information

Log-linear Models for Contingency Tables

Log-linear Models for Contingency Tables Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A

More information

Statistical Techniques II EXST7015 Simple Linear Regression

Statistical Techniques II EXST7015 Simple Linear Regression Statistical Techniques II EXST7015 Simple Linear Regression 03a_SLR 1 Y - the dependent variable 35 30 25 The objective Given points plotted on two coordinates, Y and X, find the best line to fit the data.

More information

Cohen s s Kappa and Log-linear Models

Cohen s s Kappa and Log-linear Models Cohen s s Kappa and Log-linear Models HRP 261 03/03/03 10-11 11 am 1. Cohen s Kappa Actual agreement = sum of the proportions found on the diagonals. π ii Cohen: Compare the actual agreement with the chance

More information

STAT 3900/4950 MIDTERM TWO Name: Spring, 2015 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis

STAT 3900/4950 MIDTERM TWO Name: Spring, 2015 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis STAT 3900/4950 MIDTERM TWO Name: Spring, 205 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis Instructions: You may use your books, notes, and SPSS/SAS. NO

More information

Lecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015

Lecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015 Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2015 1 / 1 Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits.

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

using the beginning of all regression models

using the beginning of all regression models Estimating using the beginning of all regression models 3 examples Note about shorthand Cavendish's 29 measurements of the earth's density Heights (inches) of 14 11 year-old males from Alberta study Half-life

More information

Homework 2: Simple Linear Regression

Homework 2: Simple Linear Regression STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA

More information

Bivariate Regression Analysis. The most useful means of discerning causality and significance of variables

Bivariate Regression Analysis. The most useful means of discerning causality and significance of variables Bivariate Regression Analysis The most useful means of discerning causality and significance of variables Purpose of Regression Analysis Test causal hypotheses Make predictions from samples of data Derive

More information

Foundations of Correlation and Regression

Foundations of Correlation and Regression BWH - Biostatistics Intermediate Biostatistics for Medical Researchers Robert Goldman Professor of Statistics Simmons College Foundations of Correlation and Regression Tuesday, March 7, 2017 March 7 Foundations

More information

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION

More information

(Where does Ch. 7 on comparing 2 means or 2 proportions fit into this?)

(Where does Ch. 7 on comparing 2 means or 2 proportions fit into this?) 12. Comparing Groups: Analysis of Variance (ANOVA) Methods Response y Explanatory x var s Method Categorical Categorical Contingency tables (Ch. 8) (chi-squared, etc.) Quantitative Quantitative Regression

More information

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing

More information

Case-Control Association Testing. Case-Control Association Testing

Case-Control Association Testing. Case-Control Association Testing Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits. Technological advances have made it feasible to perform case-control association studies

More information

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)

Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Z-test χ 2 -test Confidence Interval Sample size and power Relative effect

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

As always, show your work and follow the HW format. You may use Excel, but must show sample calculations.

As always, show your work and follow the HW format. You may use Excel, but must show sample calculations. As always, show your work and follow the HW format. You may use Excel, but must show sample calculations. 1. Single Mean. A new roof truss is designed to hold more than 5000 pounds of snow load. You test

More information

Regression Analysis. BUS 735: Business Decision Making and Research

Regression Analysis. BUS 735: Business Decision Making and Research Regression Analysis BUS 735: Business Decision Making and Research 1 Goals and Agenda Goals of this section Specific goals Learn how to detect relationships between ordinal and categorical variables. Learn

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) T In 2 2 tables, statistical independence is equivalent to a population

More information

Topic 10 - Linear Regression

Topic 10 - Linear Regression Topic 10 - Linear Regression Least squares principle Hypothesis tests/confidence intervals/prediction intervals for regression 1 Linear Regression How much should you pay for a house? Would you consider

More information

Statistics for exp. medical researchers Regression and Correlation

Statistics for exp. medical researchers Regression and Correlation Faculty of Health Sciences Regression analysis Statistics for exp. medical researchers Regression and Correlation Lene Theil Skovgaard Sept. 28, 2015 Linear regression, Estimation and Testing Confidence

More information

13 Simple Linear Regression

13 Simple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

Categorical data analysis Chapter 5

Categorical data analysis Chapter 5 Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases

More information

Exam details. Final Review Session. Things to Review

Exam details. Final Review Session. Things to Review Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

x3,..., Multiple Regression β q α, β 1, β 2, β 3,..., β q in the model can all be estimated by least square estimators

x3,..., Multiple Regression β q α, β 1, β 2, β 3,..., β q in the model can all be estimated by least square estimators Multiple Regression Relating a response (dependent, input) y to a set of explanatory (independent, output, predictor) variables x, x 2, x 3,, x q. A technique for modeling the relationship between variables.

More information

Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 12/15/2008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

More information

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION In this lab you will learn how to use Excel to display the relationship between two quantitative variables, measure the strength and direction of the

More information

QUANTITATIVE STATISTICAL METHODS: REGRESSION AND FORECASTING JOHANNES LEDOLTER VIENNA UNIVERSITY OF ECONOMICS AND BUSINESS ADMINISTRATION SPRING 2013

QUANTITATIVE STATISTICAL METHODS: REGRESSION AND FORECASTING JOHANNES LEDOLTER VIENNA UNIVERSITY OF ECONOMICS AND BUSINESS ADMINISTRATION SPRING 2013 QUANTITATIVE STATISTICAL METHODS: REGRESSION AND FORECASTING JOHANNES LEDOLTER VIENNA UNIVERSITY OF ECONOMICS AND BUSINESS ADMINISTRATION SPRING 3 Introduction Objectives of course: Regression and Forecasting

More information

Multiple Regression. More Hypothesis Testing. More Hypothesis Testing The big question: What we really want to know: What we actually know: We know:

Multiple Regression. More Hypothesis Testing. More Hypothesis Testing The big question: What we really want to know: What we actually know: We know: Multiple Regression Ψ320 Ainsworth More Hypothesis Testing What we really want to know: Is the relationship in the population we have selected between X & Y strong enough that we can use the relationship

More information

Lecture 10: Introduction to Logistic Regression

Lecture 10: Introduction to Logistic Regression Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial

More information

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining

More information

( ), which of the coefficients would end

( ), which of the coefficients would end Discussion Sheet 29.7.9 Qualitative Variables We have devoted most of our attention in multiple regression to quantitative or numerical variables. MR models can become more useful and complex when we consider

More information

Correlation. Bivariate normal densities with ρ 0. Two-dimensional / bivariate normal density with correlation 0

Correlation. Bivariate normal densities with ρ 0. Two-dimensional / bivariate normal density with correlation 0 Correlation Bivariate normal densities with ρ 0 Example: Obesity index and blood pressure of n people randomly chosen from a population Two-dimensional / bivariate normal density with correlation 0 Correlation?

More information

Confidence Interval for the mean response

Confidence Interval for the mean response Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.

More information

STAT Chapter 11: Regression

STAT Chapter 11: Regression STAT 515 -- Chapter 11: Regression Mostly we have studied the behavior of a single random variable. Often, however, we gather data on two random variables. We wish to determine: Is there a relationship

More information

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression

Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

REVIEW 8/2/2017 陈芳华东师大英语系

REVIEW 8/2/2017 陈芳华东师大英语系 REVIEW Hypothesis testing starts with a null hypothesis and a null distribution. We compare what we have to the null distribution, if the result is too extreme to belong to the null distribution (p

More information

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F).

STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis. 1. Indicate whether each of the following is true (T) or false (F). STA 4504/5503 Sample Exam 1 Spring 2011 Categorical Data Analysis 1. Indicate whether each of the following is true (T) or false (F). (a) (b) (c) (d) (e) In 2 2 tables, statistical independence is equivalent

More information