cor(dataset$measurement1, dataset$measurement2, method="pearson")
cor.test(datavector1, datavector2, method="pearson")
- Angela Shelton
- 5 years ago
Tutorial 7: Correlation and Regression

Correlation

Used to test whether two variables are linearly associated. A correlation coefficient (r) indicates the strength and direction of the association. A correlation coefficient ranges from -1 (perfect negative linear association) to 1 (perfect positive linear association). An r-value of 0 indicates no linear association between the two variables. Correlation does not indicate causation.

To obtain a correlation coefficient (r) for two variables, use the cor() command. Pearson's correlation is the default method, designated with the method= argument. Kendall ("kendall") and Spearman ("spearman") are also available. The Pearson method is intended for use on normally distributed data (parametric). Spearman and Kendall are non-parametric alternatives.

    cor(datavector1, datavector2, method="pearson")
    cor(dataset$measurement1, dataset$measurement2, method="pearson")

To obtain significance values for correlation coefficients, use the cor.test() command. Pearson's correlation is the default method; Kendall ("kendall") and Spearman ("spearman") are also available.

    cor.test(datavector1, datavector2, method="pearson")
    cor.test(dataset$measurement1, dataset$measurement2, method="pearson")

Linear Regression

Used to find a best fit line through the data that can be used to predict one variable based on another. Data inputs can be either vectors or columns from a data frame. The response variable (dependent variable; Y-axis) is listed first, then the explanatory variable (independent variable; X-axis). Linear regression between two variables requires the creation of a linear model:

    lm(responsevector~explanatoryvector)
    lm(dataset$response~dataset$explanatory)
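The cor(), cor.test() and lm() commands described above can be tried without the tutorial's data file. The sketch below is only an illustration on simulated data: the data frame example_sim and its columns are made up here and are not the tutorial's dataset.

```r
# Simulated data (an assumption for illustration; not the tutorial's dataset)
set.seed(1)
diversity <- runif(32, min = 0, max = 10)
richness  <- 2 + 1.1 * diversity + rnorm(32, sd = 1.5)
example_sim <- data.frame(diversity = diversity, richness = richness)

# Correlation coefficient only
r <- cor(example_sim$richness, example_sim$diversity, method = "pearson")

# Correlation coefficient plus a significance test
ct <- cor.test(example_sim$richness, example_sim$diversity, method = "pearson")
ct$estimate  # the same r as returned by cor()
ct$p.value   # p-value for the null hypothesis of zero correlation

# Linear model: response first, then explanatory variable
m <- lm(richness ~ diversity, data = example_sim)
coef(m)      # intercept and slope of the best fit line
```

Using the data= argument, as in the last call, keeps the coefficient names short ("diversity" rather than "example_sim$diversity").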
Use the summary command to get more information about the linear model:

    summary(lm(responsevector~explanatoryvector))
    summary(lm(dataset$response~dataset$explanatory))

You can also assign the linear model to a name and run the summary command on that.

    model1=lm(responsevector~explanatoryvector)
    summary(model1)
    model2=lm(dataset$response~dataset$explanatory)
    summary(model2)

Look for the p-value and adjusted R-squared value in the output (on the last two lines of the example below):

    Call:
    lm(formula = example$richness ~ example$diversity)

    Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
    (Intercept)                                    *
    example$diversity                              e-07 ***
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error:  on 30 degrees of freedom
    Multiple R-squared:  , Adjusted R-squared:
    F-statistic:  on 1 and 30 DF, p-value: 1.278e-07

The strength of the relationship between the two variables is expressed as an R^2 value. This value tells us what proportion of the variation in the response variable (dependent variable; Y-axis) is explained by the explanatory variable (independent variable; X-axis). R^2 values range from 0 (no variation explained) to 1 (variation explained perfectly, 100%).

The P-value tells whether there is a statistically significant linear relationship between these variables. A value below 0.05 is conventionally taken as evidence that the slope differs from zero, that is, that the response variable changes with the explanatory variable. As with correlation, a small P-value does not by itself show that the explanatory variable causes the change.
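Rather than reading the adjusted R-squared and p-value off the printed summary, they can be pulled out of the summary object. This is a minimal sketch on simulated vectors x and y (made-up names and data, not the tutorial's dataset); it also checks that, with a single explanatory variable, the slope's p-value and the F-statistic's p-value agree.

```r
# Simulated vectors (an assumption for illustration)
set.seed(1)
x <- runif(32, 0, 10)
y <- 2 + 1.1 * x + rnorm(32, sd = 1.5)

model1 <- lm(y ~ x)
s <- summary(model1)

r2     <- s$r.squared       # Multiple R-squared
adj_r2 <- s$adj.r.squared   # Adjusted R-squared

# p-value for the slope, from the Coefficients table
slope_p <- coef(s)["x", "Pr(>|t|)"]

# p-value printed on the F-statistic line, recomputed from the stored statistic
f <- s$fstatistic
overall_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
```

With more than one explanatory variable the two p-values answer different questions, so they would generally differ.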
Adding a Trendline to a Scatterplot

Regression and correlation data are usually presented as a scatterplot with a trendline (best fit line). A trendline can be added to a scatterplot (as created with the plot() command) by using the abline() command. The response variable is the Y-axis (dependent) variable, and the explanatory variable is the X-axis (independent) variable. Pay close attention to the order of the variables: plot() takes the explanatory variable first, while the lm() formula lists the response variable first. The abline() command has arguments for line type (lty=), color (col=), and line width (lwd=).

    plot(explanatory, response)
    abline(lm(response~explanatory), lty=1, col="red", lwd=1)
    plot(dataset$explanatory, dataset$response)
    abline(lm(dataset$response~dataset$explanatory), lty=1, col="red", lwd=1)

Testing for Significant Difference between Slopes

The simba package contains a command, diffslope(), that compares two linear regression models: it calculates the difference in slope between the two lines and tests whether that difference is significant. The variables for the first regression model are specified in the x1 (explanatory) and y1 (response) arguments. These two vectors or data frame columns must be the same length (contain the same number of values). The variables for the second regression model are specified in the x2 (explanatory) and y2 (response) arguments, which must likewise be the same length as each other. Adding the argument ic=TRUE will also calculate the difference in intercept and test it for significance.

    library(simba)
    diffslope(x1, y1, x2, y2, ic=TRUE)
    diffslope(dataset$explanatory1, dataset$response1, dataset$explanatory2, dataset$response2, ic=TRUE)
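If the simba package is not available, a difference in slopes can also be examined in base R by fitting one model with an interaction term. This is a different technique from diffslope() (a parametric t-test on the interaction coefficient, not simba's procedure), shown here only as a sketch on simulated data; the data frames young, old and both are made up for the example.

```r
# Simulated groups with different slopes (an assumption for illustration)
set.seed(2)
n <- 20
young <- data.frame(diversity = runif(n, 0, 10), age = "young")
young$richness <- 1 + 1.2 * young$diversity + rnorm(n)
old <- data.frame(diversity = runif(n, 0, 10), age = "old")
old$richness <- 3 + 0.4 * old$diversity + rnorm(n)

# Slopes from two separate regressions
slope_young <- coef(lm(richness ~ diversity, data = young))["diversity"]
slope_old   <- coef(lm(richness ~ diversity, data = old))["diversity"]
slope_diff  <- unname(slope_young - slope_old)

# One model with an interaction; "old" is set as the reference level
both <- rbind(young, old)
both$age <- factor(both$age, levels = c("old", "young"))
fit_int <- lm(richness ~ diversity * age, data = both)

# The interaction row holds the slope difference (young minus old) and its p-value
int_row <- coef(summary(fit_int))["diversity:ageyoung", ]
int_row["Estimate"]
int_row["Pr(>|t|)"]
```

The interaction estimate equals the difference between the two separately fitted slopes, which is a quick way to confirm the model is set up as intended.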
Quadratic (Polynomial) Regression

To create a quadratic regression model, the poly() command is used to indicate that a second-order polynomial should be used to fit the regression. The number 2 (second argument) designates this as a quadratic function, but you can also use higher-order functions. The raw=TRUE argument makes the coefficients apply directly to x and x^2.

    lm(dataset$response~poly(dataset$explanatory, 2, raw=TRUE))

More information about the model can be obtained with the summary() command.

    summary(lm(dataset$response~poly(dataset$explanatory, 2, raw=TRUE)))

Again, the model can be assigned to a name for easier use with other commands.

    fit=lm(dataset$response~poly(dataset$explanatory, 2, raw=TRUE))
    summary(fit)

To generate a plot showing a polynomial regression line, first generate the scatter plot.

    plot(dataset$explanatory, dataset$response)

Then generate a function for drawing the regression line. Assign the function to a name. The coefficients are stored in the order intercept, x term, x^2 term.

    pol2<-function(x) coef(fit)[3]*x^2 + coef(fit)[2]*x + coef(fit)[1]

Then add the regression line to the scatter plot. The add=TRUE argument adds the result of the curve() command to the previously created plot.

    curve(pol2, add=TRUE)
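The hand-written pol2() function above should give the same fitted values as R's predict() command. The sketch below checks this on simulated data (the data frame d and its columns x and y are made up for the example); fitting with data= is what lets predict() accept new x values.

```r
# Simulated curved relationship (an assumption for illustration)
set.seed(3)
x <- runif(40, 0, 10)
y <- 1 + 2.5 * x - 0.2 * x^2 + rnorm(40)
d <- data.frame(x = x, y = y)

# Quadratic fit with raw (untransformed) polynomial terms
fit <- lm(y ~ poly(x, 2, raw = TRUE), data = d)

# Regression curve written out by hand from the coefficients
pol2 <- function(x) coef(fit)[3] * x^2 + coef(fit)[2] * x + coef(fit)[1]

# Same curve from predict(), evaluated on a grid of x values
grid <- seq(0, 10, length.out = 50)
by_hand    <- unname(pol2(grid))
by_predict <- unname(predict(fit, newdata = data.frame(x = grid)))

max(abs(by_hand - by_predict))  # should be zero up to rounding error
```

The predict() route extends to higher-order polynomials without rewriting the function.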
Logistic Regression

Used when the response (dependent; Y-axis) variable is binary: possible outcomes are 0 or 1 (categorical; like yes/no or dead/alive). The explanatory (independent; X-axis) variable is numerical. Predicts the probability of having a 0 or 1 response based on a given explanatory value. For example, what is the probability of a tree surviving to the next year based on height? Commonly used for survival data obtained from population monitoring.

Note the use of the glm() command instead of lm(). You must specify a binomial (logistic) model using the family= argument. The summary() command provides more information.

    glm(formula=dataset$response~dataset$explanatory, family=binomial)
    summary(glm(formula=dataset$response~dataset$explanatory, family=binomial))

There is an alternate way to code this, where the dataset itself is specified using the data= argument, which means the response and explanatory variables can be named without the preceding dataset$.

    summary(glm(formula=response~explanatory, data=dataset, family=binomial))

In the output, look for the P value, Pr(>|z|), in the row of the Coefficients section that corresponds to the explanatory variable (Richness in this example output; Age is the response).

    Call:
    glm(formula = Age ~ Richness, family = binomial, data = example)

    Deviance Residuals:
    Min 1Q Median 3Q Max

    Coefficients:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept)
    Richness

    (Dispersion parameter for binomial family taken to be 1)

    Null deviance:  on 31 degrees of freedom
    Residual deviance:  on 30 degrees of freedom
    AIC:
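The tree-survival question mentioned above can be sketched as follows. The data are simulated and the names trees, height and survived are made up for this example; the point is only to show where the P value sits in the summary object and how to turn the fitted model into a predicted probability.

```r
# Simulated survival data (an assumption for illustration)
set.seed(4)
height   <- runif(100, 1, 20)
p_true   <- plogis(-3 + 0.4 * height)          # true survival probability
survived <- rbinom(100, size = 1, prob = p_true)
trees    <- data.frame(height = height, survived = survived)

# Logistic regression: binary response, numerical explanatory variable
g <- glm(survived ~ height, data = trees, family = binomial)

# P value for the explanatory variable, from the Coefficients table
p_height <- coef(summary(g))["height", "Pr(>|z|)"]

# Predicted probability of survival for a tree of height 10
p10 <- unname(predict(g, newdata = data.frame(height = 10), type = "response"))

# The same probability computed from the coefficients on the logit scale
p10_by_hand <- unname(plogis(coef(g)[1] + coef(g)[2] * 10))
```

Without type = "response", predict() returns values on the logit (log-odds) scale rather than probabilities.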
Odds Ratio

For a logistic regression model, the odds ratio is a commonly calculated statistic. It indicates that for every unit increase in the explanatory variable, the odds of getting a particular response are multiplied by n, where n is the odds ratio. Note that this is a change in the odds, p/(1-p), not in the probability itself.

    exp(coef(glm(formula=response~explanatory, data=dataset, family=binomial)))

The output value under the explanatory variable (Richness in this example) is the odds ratio.

    (Intercept) Richness

You can also assign the regression model to a name (model.name in this example).

    model.name=glm(formula=response~explanatory, data=dataset, family=binomial)
    exp(coef(model.name))
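To see that the odds ratio describes a change in odds rather than in probability, the sketch below compares exp(coef()) with the ratio of predicted odds at two explanatory values one unit apart. The data are simulated and the names trees, height and survived are made up for this example.

```r
# Simulated survival data (an assumption for illustration)
set.seed(4)
height   <- runif(100, 1, 20)
survived <- rbinom(100, size = 1, prob = plogis(-3 + 0.4 * height))
trees    <- data.frame(height = height, survived = survived)

g <- glm(survived ~ height, data = trees, family = binomial)

# Odds ratios: exponentiated coefficients
or <- exp(coef(g))

# Predicted probabilities at heights 10 and 11, converted to odds
p    <- predict(g, newdata = data.frame(height = c(10, 11)), type = "response")
odds <- p / (1 - p)

# Ratio of the odds for a one-unit increase in height
ratio <- unname(odds[2] / odds[1])

or["height"]  # matches ratio
```

The same ratio is obtained for any pair of heights one unit apart, whereas the change in probability depends on where along the curve the pair sits.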
Tutorial Code

setwd("/users/johndoe/desktop/")
example=read.csv("r_example_dataframe.csv")

#Correlation
cor(example$richness, example$diversity, method="pearson")
cor.test(example$richness, example$diversity, method="pearson")
cor.test(example$richness, example$diversity, method="spearman")

#Linear Regression
lm(example$richness~example$diversity)
summary(lm(example$richness~example$diversity))

#Regression models can be assigned a name; for example, model1
model1=lm(example$richness~example$diversity)
summary(model1)

#Make a scatterplot of the data
plot(example$diversity, example$richness, ylab="richness", xlab="diversity", ylim=c(0,14), xlim=c(0,10), pch=16, col="blue", cex=1.5, las=1)

#Add the trendline to the scatterplot; Note two ways to do this, depending on whether regression model was named or not
abline(lm(example$richness~example$diversity))
abline(model1, col="red", lwd=3, lty=2)

#Add text to scatterplot with R^2 and P values
text(1.5, 13, "R^2 = ")
text(1.5, 12, "P = 1.278e-07")

#Test for difference in slope between young and old plots
#Create old and young subsets of the data
young=example[grep("young", example$age),]
old=example[grep("old", example$age),]

#Load the simba package
library(simba)

#Test whether the linear regressions of Richness vs. Diversity differ between young and old plots; Look for the Significance value in the output
#Arguments are x1 (explanatory), y1 (response), x2 (explanatory), y2 (response)
diffslope(young$diversity, young$richness, old$diversity, old$richness)

#Quadratic Regression
quadfit=lm(example$richness~poly(example$diversity, 2, raw=TRUE))
summary(quadfit)

#Create a scatterplot of the data
plot(example$diversity, example$richness, ylab="richness", xlab="diversity", ylim=c(0,14), xlim=c(0,10), pch=16, col="blue", cex=1.5, las=1)

#Specify the quadratic function for the polynomial fit
pol2=function(x) coef(quadfit)[3]*x^2 + coef(quadfit)[2]*x + coef(quadfit)[1]

#Add the quadratic trendline to the scatterplot
curve(pol2, add=TRUE)

#Logistic Regression
#Specify the binomial regression model
glm(formula=age~richness, data=example, family=binomial)
summary(glm(formula=age~richness, data=example, family=binomial))
g=glm(formula=age~richness, data=example, family=binomial)
summary(g)

#Odds Ratio
exp(coef(glm(formula=age~richness, data=example, family=binomial)))
exp(coef(g))
Kedem, STAT 430 SAS Examples: Logistic Regression ==================================== ssh abc@glue.umd.edu, tap sas913, sas https://www.statlab.umd.edu/sasdoc/sashtml/onldoc.htm a. Logistic regression.
More informationlm statistics Chris Parrish
lm statistics Chris Parrish 2017-04-01 Contents s e and R 2 1 experiment1................................................. 2 experiment2................................................. 3 experiment3.................................................
More informationStat 4510/7510 Homework 7
Stat 4510/7510 Due: 1/10. Stat 4510/7510 Homework 7 1. Instructions: Please list your name and student number clearly. In order to receive credit for a problem, your solution must show sufficient details
More informationModule 4: Regression Methods: Concepts and Applications
Module 4: Regression Methods: Concepts and Applications Example Analysis Code Rebecca Hubbard, Mary Lou Thompson July 11-13, 2018 Install R Go to http://cran.rstudio.com/ (http://cran.rstudio.com/) Click
More informationTHE PEARSON CORRELATION COEFFICIENT
CORRELATION Two variables are said to have a relation if knowing the value of one variable gives you information about the likely value of the second variable this is known as a bivariate relation There
More information> modlyq <- lm(ly poly(x,2,raw=true)) > summary(modlyq) Call: lm(formula = ly poly(x, 2, raw = TRUE))
School of Mathematical Sciences MTH5120 Statistical Modelling I Tutorial 4 Solutions The first two models were looked at last week and both had flaws. The output for the third model with log y and a quadratic
More informationRegression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.
Regression, Part I I. Difference from correlation. II. Basic idea: A) Correlation describes the relationship between two variables, where neither is independent or a predictor. - In correlation, it would
More informationStatistics 203 Introduction to Regression Models and ANOVA Practice Exam
Statistics 203 Introduction to Regression Models and ANOVA Practice Exam Prof. J. Taylor You may use your 4 single-sided pages of notes This exam is 7 pages long. There are 4 questions, first 3 worth 10
More informationVarious Issues in Fitting Contingency Tables
Various Issues in Fitting Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Complete Tables with Zero Entries In contingency tables, it is possible to have zero entries in a
More informationTests of Linear Restrictions
Tests of Linear Restrictions 1. Linear Restricted in Regression Models In this tutorial, we consider tests on general linear restrictions on regression coefficients. In other tutorials, we examine some
More informationLinear regression and correlation
Faculty of Health Sciences Linear regression and correlation Statistics for experimental medical researchers 2018 Julie Forman, Christian Pipper & Claus Ekstrøm Department of Biostatistics, University
More informationST430 Exam 1 with Answers
ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationRegression models. Generalized linear models in R. Normal regression models are not always appropriate. Generalized linear models. Examples.
Regression models Generalized linear models in R Dr Peter K Dunn http://www.usq.edu.au Department of Mathematics and Computing University of Southern Queensland ASC, July 00 The usual linear regression
More informationIntroduction to Statistical Data Analysis Lecture 8: Correlation and Simple Regression
Introduction to Statistical Data Analysis Lecture 8: and James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis 1 / 40 Introduction
More informationA Generalized Linear Model for Binomial Response Data. Copyright c 2017 Dan Nettleton (Iowa State University) Statistics / 46
A Generalized Linear Model for Binomial Response Data Copyright c 2017 Dan Nettleton (Iowa State University) Statistics 510 1 / 46 Now suppose that instead of a Bernoulli response, we have a binomial response
More information