An Analysis
Jane Doe
Department of Biostatistics, Vanderbilt University School of Medicine
March 19, 2011
Contents
1 Descriptive Statistics
2 Redundancy Analysis and Variable Interrelationships
3 Logistic Regression Model
4 Test Calculations
5 Computing Environment

1 Descriptive Statistics

    getHdata(support)   # Use Hmisc/getHdata to get dataset from VU DataSets wiki
    d <- subset(support, select=c(age, sex, race, edu, income, hospdead, slos, dzgroup, meanbp, hrt))
    latex(describe(d), file='')

d
10 Variables   1000 Observations

age : Age
    Reported: n, missing, unique, Mean, lowest and highest values

sex
    n 1000, missing 0, unique 2
    female (438, 44%), male (562, 56%)

race
    Reported: n, missing, unique; Frequency and % for the levels white, black, asian, other, hispanic

edu : Years of Education
    Reported: n, missing, unique, Mean, lowest and highest values

income
    Reported: n, missing, unique
    under $11k (309, 47%), $11-$25k (161, 25%), $25-$50k (106, 16%), >$50k (75, 12%)
hospdead : Death in Hospital
    Reported: n, missing, unique, Sum, Mean

slos : Days from Study Entry to Discharge
    Reported: n, missing, unique, Mean, lowest and highest values

dzgroup
    n 1000, missing 0, unique 8
    Levels: ARF/MOSF w/sepsis, COPD, CHF, Cirrhosis, Coma, Colon Cancer, Lung Cancer, MOSF w/malig
    MOSF w/malig: Frequency 86 (9%)

meanbp : Mean Arterial Blood Pressure Day 3
    Reported: n, missing, unique, Mean, lowest and highest values

hrt : Heart Rate Day 3
    Reported: n, missing, unique, Mean, lowest and highest values

Race is reduced to three levels (white, black, OTHER) because of low frequencies in the other levels (minimum relative frequency set to 0.05).

    d <- upData(d, race = combine.levels(race, minlev = 0.05))

    Input object size: 17 bytes; 10 variables
    Modified variable race
    New object size: 1688 bytes; 10 variables

2 Redundancy Analysis and Variable Interrelationships

    v <- varclus(~., data=d)
    plot(v)
    redun(~ age+sex+race+edu+income+dzgroup+meanbp+hrt, data=d)

    Redundancy Analysis

    redun(formula = ~age + sex + race + edu + income + dzgroup + meanbp + hrt, data = d)

    n: 617   p: 8   nk: 3

    Number of NAs: 383
    Frequencies of Missing Values Due to Each Variable
        age sex race edu income dzgroup meanbp hrt

    Transformation of target variables forced to be linear

    R^2 cutoff: 0.9   Type: ordinary

    R^2 with which each variable can be predicted from all other variables:
        age sex race edu income dzgroup meanbp hrt

    No redundant variables
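The redun output above reports, for each variable, the R^2 with which it can be predicted from all the other variables, and compares it with a cutoff of 0.9. The following is a minimal base-R sketch of that idea only, not the Hmisc implementation: it uses simulated data, plain linear models via lm(), and a single pass, whereas redun fits flexible additive models and removes variables stepwise. The names toy, x1 to x4, and r2_from_others are hypothetical.

```r
# Simplified illustration of a redundancy check (assumption: linear
# prediction with lm() stands in for the flexible models used by redun).
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.1)  # almost fully determined by x1 and x2
x4 <- rnorm(n)                      # unrelated to the other variables
toy <- data.frame(x1, x2, x3, x4)

# R^2 obtained when each column is regressed on all remaining columns
r2_from_others <- function(data) {
  sapply(names(data), function(v) {
    others <- setdiff(names(data), v)
    fit <- lm(reformulate(others, response = v), data = data)
    summary(fit)$r.squared
  })
}

r2 <- r2_from_others(toy)
print(round(r2, 3))

# Variables whose R^2 exceeds the cutoff are candidates for removal
cutoff <- 0.9
print(names(r2)[r2 > cutoff])
```

In this toy data x1, x2, and x3 each exceed the cutoff because any one of them is nearly a linear function of the other two. A stepwise procedure would drop one of them and then recompute the remaining R^2 values before deciding on the rest.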
    # Alternative: redun(~., data = subset(d, select = -c(hospdead, slos)))

[Figure: variable clustering dendrogram from plot(v). Vertical axis: Spearman ρ². Leaves: meanbp, hospdead, dzgroupComa, dzgroupCOPD, dzgroupMOSF w/malig, sexmale, age, hrt, dzgroupCirrhosis, dzgroupColon Cancer, dzgroupLung Cancer, slos, dzgroupCHF, income$11-$25k, income$25-$50k, racewhite, raceblack, edu, income>$50k.]

Note that the clustering of black with white is not interesting; it only reflects that these are mutually exclusive, higher-frequency categories, which makes them negatively correlated.

3 Logistic Regression Model

Here we fit a tentative binary logistic regression model. The coefficients are not very useful, so they are not printed. Note: the symbolic section reference below was created by the following R comment:

    # see Section (*\ref{descStats}*) for descriptive statistics

The label was defined in an earlier section using

    \section{Descriptive Statistics}\label{descStats}

    dd <- datadist(d); options(datadist='dd')
    f <- lrm(hospdead ~ rcs(age,4) + sex + race + dzgroup + rcs(meanbp,5), data=d)
    # see Section 1 for descriptive statistics
    print(f, latex=TRUE, coefs=FALSE)
Logistic Regression Model

    lrm(formula = hospdead ~ rcs(age, 4) + sex + race + dzgroup + rcs(meanbp, 5), data = d)

Frequencies of Missing Values Due to Each Variable
    hospdead age sex race dzgroup meanbp (one nonzero count: 5)

                   Model Likelihood      Discrimination    Rank Discrim.
                   Ratio Test            Indexes           Indexes
    Obs 995        LR χ²                 R²                C
    max |deriv|    d.f. 17               g 1.605           Dxy
                   Pr(>χ²) <0.0001       gr 4.98           γ 0.62
                                         gp 0.228          τa 0.227
                                         Brier 0.144

    latex(anova(f), where='h', file='')   # can also try where='htbp'

Table 1: Wald Statistics for hospdead (columns: χ², d.f., P)

    Term                P
    age
      Nonlinear
    sex
    race
    dzgroup             <0.0001
    meanbp              <0.0001
      Nonlinear         <0.0001
    TOTAL NONLINEAR     <0.0001
    TOTAL               <0.0001

    print(plot(Predict(f)))
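The 17 d.f. reported for the model above can be accounted for term by term. The sketch below is plain arithmetic in base R rather than output from rms; it assumes the standard counts of k - 1 d.f. for a restricted cubic spline with k knots (of which k - 2 are nonlinear) and L - 1 d.f. for a factor with L levels. The helper names rcs_df, rcs_nonlinear_df, and factor_df are hypothetical.

```r
# Degrees of freedom per term in
#   hospdead ~ rcs(age, 4) + sex + race + dzgroup + rcs(meanbp, 5)
rcs_df           <- function(knots)  knots - 1   # linear term + nonlinear terms
rcs_nonlinear_df <- function(knots)  knots - 2   # nonlinear terms only
factor_df        <- function(levels) levels - 1  # indicator variables

term_df <- c(
  age     = rcs_df(4),     # 4 knots
  sex     = factor_df(2),  # female, male
  race    = factor_df(3),  # white, black, OTHER
  dzgroup = factor_df(8),  # 8 disease groups
  meanbp  = rcs_df(5)      # 5 knots
)
print(term_df)

total_df <- sum(term_df)   # matches the d.f. of the likelihood ratio test
print(total_df)

# d.f. behind the "Nonlinear" and "TOTAL NONLINEAR" rows of the Wald table
nonlinear_df <- c(age = rcs_nonlinear_df(4), meanbp = rcs_nonlinear_df(5))
print(nonlinear_df)
print(sum(nonlinear_df))
```

The total of 3 + 1 + 2 + 7 + 4 = 17 agrees with the d.f. shown for the model likelihood ratio test.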
[Figure: output of plot(Predict(f)). Panels for age, sex (female, male), race (OTHER, white, black), dzgroup, and meanbp; vertical axis: log odds.]

4 Test Calculations

    x <- 3; y <- 2
    if(x <= y) 'this' else 'that'

    [1] "that"

    if(y >= x) 'that' else 'this'

    [1] "this"

    x^y

    [1] 9

    plot(runif(2), runif(2))
[Figure: scatterplot from plot(runif(2), runif(2)); both axes are labeled runif(2).]

5 Computing Environment

These analyses were done using the following versions of R [1], the operating system, and add-on packages Hmisc [2], rms [3], and others:

- R version ( ), x86_64-pc-linux-gnu
- Base packages: base, datasets, graphics, grDevices, grid, methods, splines, stats, utils
- Other packages: Hmisc 3.8-3, lattice 0.19-17, rms 3.3-0, survival
- Loaded via a namespace (and not attached): cluster

References

[1] R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2010.
[2] Frank E. Harrell. Hmisc: A package of miscellaneous S functions. Available from biostat.mc.vanderbilt.edu/s/hmisc, 2011.
[3] Frank E. Harrell. rms: S functions for biostatistical/epidemiologic modeling, testing, estimation, validation, graphics, prediction, and typesetting by storing enhanced model design attributes in the fit. Available from biostat.mc.vanderbilt.edu/rms, 2011. Implements methods in Regression Modeling Strategies, New York: Springer, 2001.