Models for Count and Binary Data. Poisson and Logistic GWR Models. 24/07/2008 GWR Workshop 1
2 Outline: I. Modelling counts: Poisson regression. II. Modelling binary events: logistic regression. III. Poisson regression in GWR: premature mortality in Tokyo. IV. Logistic regression in GWR: landslides in Clearwater National Forest, Idaho.
3 Background: Standard GWR uses OLS (ordinary least squares) methods. These are not always the best option, because OLS assumes a Normal (Gaussian) error term.
4 Non-Normality (1): Count data need a model form that cannot predict negative values: the Poisson model. Examples: cases of an illness; sightings of a rare animal; number of crimes; number of earth tremors.
5 Non-Normality (2): Dichotomous (binary; yes/no; 1/0) outcomes need a model form that predicts the probability that an observation is 1, and hence must give probabilities between 0 and 1: the logistic (binary logit) model. Examples: Does an individual have a disease or not? Is a house detached or not? Was a crime committed at this location within the past week?
6 I: Models for Count Data. Poisson Regression
7 The Poisson Model: Lambda is the expected count of objects given the conditions at location i. The betas are regression coefficients. The x's are predictor variables. Note that lambda will always be greater than or equal to zero.
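The equation on this slide is not present in the transcript. A minimal reconstruction, assuming the usual log-linear Poisson form with m predictors (an assumption, though it agrees with the offset form $y = n e^{X\beta}$ given later in the deck):

```latex
y_i \sim \mathrm{Poisson}(\lambda_i), \qquad
\lambda_i = \exp\left(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_m x_{mi}\right)
```

Because the right-hand side is an exponential, the expected count can never be negative, which is the property the slide highlights.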
8 Offsets: Often we have a population at risk for count data, for example the population susceptible to a disease, or the number of households (for rates of household burglaries). For zone-based data, this quantity changes from zone to zone and we need to allow for this in our model. We do this by using an offset.
9 Adding the Offset: P_i is a variable (not a parameter to calibrate). It represents the population at risk. Calibration of the betas is by an iterative process, so it takes longer!
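The offset equation itself is missing from the transcript. A sketch of the form it most likely takes, assuming the standard Poisson model with the population at risk $P_i$ as a multiplier (the notation for the predictors is an assumption):

```latex
\lambda_i = P_i \exp\left(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots\right)
\quad\Longleftrightarrow\quad
\log \lambda_i = \log P_i + \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots
```

On the log scale $\log P_i$ enters with a coefficient fixed at 1, which is why $P_i$ is a variable and not a parameter to calibrate. Dividing through by $P_i$ shows that the betas describe the rate $\lambda_i / P_i$.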
10 Example: Household burglary counts for 43 police forces in the UK (the y_i's). Offset = number of households (P_i). Predictors (the x_ij's): population density (persons/sq km); unemployed young males as % of total population.
11 Results. Columns: Estimate, Std. Error, z value, P-value. (Intercept): P-value <0.001. Youngunemp: Estimate 9.601e…, P-value <0.001. Density: Estimate 1.884e…
12 Question: Why are the z-values so high? Sometimes this is due to the variation in the counts being greater than expected for a Poisson distribution (overdispersion). Maybe this is because the parameters vary over space?
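One common way to check whether the counts vary more than a Poisson distribution allows is the Pearson dispersion statistic: the Pearson chi-square divided by the residual degrees of freedom. The sketch below is illustrative only; the function name and the numbers are hypothetical and are not taken from the burglary data.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom.

    Values well above 1 suggest more variation than a Poisson model expects.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    # For a Poisson model the variance equals the mean, so each squared
    # residual is scaled by the fitted mean.
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / dof


# Hypothetical counts and fitted means from a two-parameter model.
counts = [2, 9, 1, 12, 0, 15]
means = [5.0, 6.0, 5.0, 7.0, 6.0, 10.0]
print(pearson_dispersion(counts, means, n_params=2))
```

When the statistic is far above 1, standard errors from the plain Poisson fit tend to be too small, which inflates the z-values.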
13 Geographically Weighted Poisson Regression: The above is still a global model. Question: do the same relationships hold everywhere? Is the linkage between unemployed young males and burglary the same everywhere? We need to extend the previous model to a geographically weighted version.
14 Geographically Weighted Poisson Model: the coefficients become functions of location, where u_i and v_i are the coordinates of observation i.
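The model equation is not present in the transcript. A hedged reconstruction, assuming it mirrors the geographically weighted logistic form given later in the deck, with each beta replaced by a function of the coordinates and $P_i$ kept as the offset:

```latex
\lambda_i = P_i \exp\left(\beta_0(u_i, v_i) + \beta_1(u_i, v_i)\, x_{1i} + \beta_2(u_i, v_i)\, x_{2i} + \dots\right)
```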
15 Results for Burglary Data
16 II: Models for Binary Data. Logistic Regression
17 Logistic Models: $\Pr(y_i = 1) = p_i$ and $\Pr(y_i = 0) = 1 - p_i$. Here, $p_i$ is the mean of the distribution for a dichotomous $y_i$, where $y_i$ is the dependent variable. We need to pay attention to $p_i$, which depends on the explanatory variables. How does $p_i$ depend on $(x_{1i}, x_{2i}, \dots, x_{mi})$?
18 The logistic model: $p_i = \mathrm{logit}(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots)$, where $\mathrm{logit}(z) = \dfrac{\exp(z)}{1 + \exp(z)}$. (Elsewhere this function is more commonly called the logistic or inverse-logit function.)
19 Graph of the logit function: logit(x) plotted against x.
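The function plotted on this slide can be evaluated directly. The sketch below implements exp(z) / (1 + exp(z)) in a numerically safe way, together with its inverse, the log odds. The function names are my own choice and do not come from the workshop material.

```python
import math


def logistic(z: float) -> float:
    """exp(z) / (1 + exp(z)), written so that large |z| cannot overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_odds(p: float) -> float:
    """Inverse of logistic(): the log odds log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# The curve is S-shaped: it passes through 0.5 at z = 0 and flattens
# towards 0 and 1 at the extremes.
for z in (-4.0, -2.0, 0.0, 2.0, 4.0):
    print(z, round(logistic(z), 4))
```

Whatever value the linear predictor takes, the result stays between 0 and 1, which is what a model for probabilities needs.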
20 Alternative form: $\log\left(\dfrac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots$, where the left-hand side of the equation is the log odds for $y_i = 1$.
21 Interpreting the parameters: As with Poisson, parameters make more sense if you take antilogs. We can write $\dfrac{p_i}{1 - p_i} = \exp(\beta_0)\exp(\beta_1 x_{1i})\exp(\beta_2 x_{2i})\dots$ Each $\exp(\beta_j)$ gives a multiplicative factor for the odds that $y_i = 1$ when the corresponding predictor increases by 1 unit. Note that the multiplicative factor is for the ODDS, not the PROBABILITY.
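A short numerical sketch can make the odds-versus-probability distinction concrete. The coefficients below are hypothetical, chosen only for illustration. The check is that a 1-unit increase in x multiplies the odds by exp(beta1) wherever you start, while the change in probability depends on the starting point.

```python
import math

# Hypothetical coefficients for a one-predictor logistic model.
BETA0 = -3.0
BETA1 = 0.02


def probability(x: float) -> float:
    """p = exp(z) / (1 + exp(z)) with z = BETA0 + BETA1 * x."""
    z = BETA0 + BETA1 * x
    return 1.0 / (1.0 + math.exp(-z))


def odds(p: float) -> float:
    return p / (1.0 - p)


def one_unit_effect(x: float):
    """Return (odds ratio, probability difference) for x -> x + 1."""
    p0, p1 = probability(x), probability(x + 1.0)
    return odds(p1) / odds(p0), p1 - p0


for x in (50.0, 150.0):
    ratio, diff = one_unit_effect(x)
    print(f"x={x}: odds ratio={ratio:.6f}, probability change={diff:.6f}")
print("exp(beta1) =", math.exp(BETA1))
```

The odds ratio printed for both starting values matches exp(beta1), whereas the two probability changes differ.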
22 An Example: Housing in the UK. Dependent variable: does a house have more than one bathroom? This is a dichotomous (binary) value. Independent variable: floor area (sq. m).
23 Results. Columns: Estimate, Std. Error, z value, Pr(>|z|). Intercept: Pr(>|z|) <0.001. FloorArea: Pr(>|z|) <0.001. A second table reports Exp(beta), the % increase, and Pr(>|z|) for FloorArea.
24 Geographically Weighted Logistic Regression: $\log\left(\dfrac{p_i}{1 - p_i}\right) = \beta_0(u_i, v_i) + \beta_1(u_i, v_i)\, x_{1i} + \beta_2(u_i, v_i)\, x_{2i} + \dots$, where $(u_i, v_i)$ are the coordinates of observation $i$. Note, as before, that the betas are functions, not coefficients, to be estimated using non-parametric methods, i.e. $\beta_0(u, v)$ and so on.
26 Interpretation: Second bathrooms are more likely in houses in South Wales and southern England. There is a general North/South effect, perhaps because property is more expensive in the south of the UK, so people tend to add second bathrooms to smaller houses.
27 Issues: Convergence problems can arise if there are areas dominated by y_i's all equal to zero or one. Take care with automatic bandwidth selection. In some cases the best policy may be trial and error with manual bandwidth control.
28 Poisson and Logistic Regression in GWR
29 III: Poisson regression: premature mortality in Tokyo. IV: Logistic regression: landslides in Clearwater National Forest, Idaho.
30 III: Poisson GW Regression
31 Poisson Regression: To the user this looks as if it is implemented in GWR in almost the same way as ordinary Gaussian regression. The dependent variable must be a count variable (i.e. the values must be integers [whole numbers]).
32 Rates: Counts and Offsets: If you wish to model counts which relate to areas with a varying underlying population, you can do this with an offset variable: $y = n e^{X\beta}$ ($n$ is the offset).
33 The Offset Variable: To keep the Model Editor simple, a variable entered as the weight variable in Poisson regression is treated as the offset.
34 Outputs: These are similar to those obtained for ordinary Gaussian regression. The interpretation of the parameters is slightly different.
35 The data: We will use data for 261 municipalities in the Tokyo Metropolitan Area. We are considering determinants of premature mortality. The independent variables are the proportions of the elderly, professionals, home owners, and unemployed.
36 Offset: The premature mortality count will obviously vary according to the size of the zone. As an offset, we will use the expected number of premature deaths.
43 The data: The data are in the folder \SampleData\Tokyo. There are shapefiles for both the municipality and prefecture boundaries.
44 The data: There are some dBASE files with socioeconomic data. There is a data file for GWR3. There is a table with information on the various geographies; we will need this for mapping the GWR results.
45 The model editor: After you have selected the data, you complete the model editor thus. Notice the offset variable entered as the weight.
46 Patience: The Poisson model is fitted using a method known as iteratively reweighted least squares. This takes about 5 times as long as fitting a Gaussian model.
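To show why the fit is iterative, here is a minimal sketch of iteratively reweighted least squares for a global Poisson model with an offset, an intercept and one predictor. It is not the GWR3 implementation: the function name, the starting values and the example data are assumptions made for illustration. Each pass solves a weighted least-squares problem whose weights and working response depend on the current fit, so the loop repeats until the coefficients stop changing.

```python
import math


def fit_poisson_irls(counts, x, population, max_iter=50, tol=1e-10):
    """Fit log(mu_i) = log(P_i) + b0 + b1 * x_i by IRLS.

    Returns (b0, b1, number_of_iterations).
    """
    offset = [math.log(p) for p in population]
    # Start from the overall rate with no predictor effect.
    b0 = math.log(sum(counts) / sum(population))
    b1 = 0.0
    for iteration in range(1, max_iter + 1):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for y_i, x_i, o_i in zip(counts, x, offset):
            eta = b0 + b1 * x_i          # linear predictor without the offset
            mu = math.exp(o_i + eta)     # expected count
            w = mu                       # IRLS weight for the Poisson log link
            z = eta + (y_i - mu) / mu    # working response
            s_w += w
            s_wx += w * x_i
            s_wxx += w * x_i * x_i
            s_wz += w * z
            s_wxz += w * x_i * z
        # Solve the 2x2 weighted normal equations in closed form.
        det = s_w * s_wxx - s_wx ** 2
        if det == 0.0:
            raise ValueError("predictor has no variation; cannot solve")
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        change = max(abs(new_b0 - b0), abs(new_b1 - b1))
        b0, b1 = new_b0, new_b1
        if change < tol:
            return b0, b1, iteration
    raise RuntimeError("IRLS did not converge")


# Hypothetical zone data: counts, one predictor, population at risk.
example_counts = [12, 40, 61, 170, 262]
example_x = [0.0, 1.0, 2.0, 3.0, 4.0]
example_population = [100, 200, 150, 300, 250]
print(fit_poisson_irls(example_counts, example_x, example_population))
```

A geographically weighted version repeats a fit like this at every regression point with an extra set of spatial weights, which is consistent with the longer run time noted on the slide.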
47 Header: GEOGRAPHICALLY WEIGHTED POISSON REGRESSION. Number of data cases read: 262. Sample data file read... *Number of observations, nobs= 262. *Number of predictors, nvar= 4. Observation Easting extent: Observation Northing extent:
48 Calibration: *Finding bandwidth using all regression points. This can take some time... *Calibration will be based on 262 cases. *Adaptive kernel sample size limits: *Crossvalidation begins... Bandwidth CV Score ** Convergence after 9 function calls ** Convergence: Local Sample Size= 79
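The "Local Sample Size" in this listing is the number of nearest neighbours used around each regression point. The sketch below shows one way such adaptive weights can be computed. The bisquare kernel is an assumption here, since the slides do not state which kernel shape GWR3 uses, and the function name is hypothetical.

```python
import math


def adaptive_bisquare_weights(coords, target, n_neighbours):
    """Weights for one regression point using an adaptive bisquare kernel.

    The bandwidth h is the distance from the target to its n-th nearest
    observation, so it widens where data are sparse and narrows where
    they are dense.
    """
    dists = [math.hypot(u - target[0], v - target[1]) for u, v in coords]
    if not 1 <= n_neighbours <= len(dists):
        raise ValueError("n_neighbours must be between 1 and the sample size")
    h = sorted(dists)[n_neighbours - 1]
    if h == 0.0:
        raise ValueError("bandwidth is zero; use more neighbours")
    # Bisquare: (1 - (d/h)^2)^2 inside the bandwidth, zero at or beyond it.
    return [(1.0 - (d / h) ** 2) ** 2 if d < h else 0.0 for d in dists]


# Five hypothetical observation locations on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
print(adaptive_bisquare_weights(points, target=(0.0, 0.0), n_neighbours=3))
```

Bandwidth selection then amounts to trying different neighbour counts and keeping the one with the best score, which is the search the listing reports.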
49 Global Model: ********** Global Poisson Model Diagnostics ********** Convergence after 3 iterations. Log-likelihood: Deviance (-2LogLikelihood): Trace of the Hat Matrix: Number of parameters in model: Akaike Information Criterion: Corrected AIC (AICc): Bayesian Information Criterion: Parameter table columns: Estimate, Std Err, T, Exp(B), Sd(Exp(B)). Rows: Intercept, Professl, Elderly, OwnHome, Unemply.
50 Local Model: ********** Local Poisson Model Diagnostics ********** Log Likelihood: Deviance: Trace of the hat matrix: Residual sum of squares: Effective number of parameters: Akaike Information Criterion: Corrected AIC: Bayesian Information Criterion:
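The diagnostics in both listings are related through the deviance and the number of parameters. The sketch below uses commonly cited definitions; these are an assumption, because the slides do not show the exact formulas GWR3 applies. For the local model, the effective number of parameters would take the place of the parameter count.

```python
import math


def information_criteria(deviance, n_params, n_obs):
    """Return (AIC, AICc, BIC) from a deviance (-2 log-likelihood).

    n_params may be fractional, e.g. an effective number of parameters.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    aic = deviance + 2.0 * n_params
    # Small-sample correction: grows as n_params approaches n_obs.
    aicc = aic + 2.0 * n_params * (n_params + 1.0) / (n_obs - n_params - 1.0)
    bic = deviance + n_params * math.log(n_obs)
    return aic, aicc, bic


# Hypothetical deviance for a 5-parameter model on 262 observations.
print(information_criteria(100.0, 5, 262))
```

A local model usually has a lower deviance but a larger effective number of parameters, so comparing AICc values balances fit against complexity.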
51 5-number summaries: PARAMETER 5-NUMBER SUMMARIES. Columns: Label, Minimum, Lwr Quartile, Median, Upr Quartile, Maximum. Rows: Intrcept, Professl, Elderly, OwnHome, Unemply.
52 Mapping: The municipality shapefile and the data used for the GWR model are in two different map projections. However, there is a lookup table to link the two attribute tables, so this time we do not use a spatial join.
53 GeogIndex.csv: The lookup table is a comma-separated values file. The ID field contains the IDs used in the municipality shapefile. The GWR_ID field contains the sequence numbers assigned by GWR3.
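The attribute join described here can be sketched outside the GIS as a simple table lookup. Only the ID and GWR_ID field names come from the slide. The sample rows, the `estimate` column and the function name are hypothetical.

```python
import csv
import io

# Hypothetical contents standing in for GeogIndex.csv and a GWR results table.
LOOKUP_CSV = "ID,GWR_ID\n13101,1\n13102,2\n13103,3\n"
RESULTS_CSV = "GWR_ID,estimate\n1,0.12\n2,-0.05\n3,0.30\n"


def join_by_lookup(lookup_file, results_file):
    """Map each municipality ID to its local estimate via GWR_ID."""
    gwr_to_id = {row["GWR_ID"]: row["ID"] for row in csv.DictReader(lookup_file)}
    joined = {}
    for row in csv.DictReader(results_file):
        municipality_id = gwr_to_id.get(row["GWR_ID"])
        if municipality_id is not None:   # skip results with no matching zone
            joined[municipality_id] = float(row["estimate"])
    return joined


print(join_by_lookup(io.StringIO(LOOKUP_CSV), io.StringIO(RESULTS_CSV)))
```

In ArcMap the same effect comes from two attribute joins through the lookup table, which avoids a spatial join across the two projections.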
54 TMABSU attributes
55 GeogIndex.csv
56 Tokyo point attributes
58 The result: You can now map parameter values and the other diagnostics. Map shown: Professionals.
59 IV: Logistic GW Regression
60 Logistic Regression: You use logistic regression when your dependent variable is binary or dichotomous; the values should be 0 or 1. The independent variables can be continuous or binary-valued dummy variables.
61 Predicted π: The predicted value is the probability that the dependent variable is 1. It is continuous, lying between 0 and 1.
62 Landslides: In November 1995 and February 1996 there were some 865 landslides in Clearwater National Forest, Idaho. Gorsevski et al. (TGIS, 10(3)) suggested that topographic factors may influence landslide occurrence.
63 The Data: We have extracted data for a subset of 239 observations: 138 landslide sites and 101 control sites.
64 The Data: Our dataset also contains some topographic indicators for sites in Clearwater National Forest where there have been landslides. A similar number of locations where there have not been landslides, the control sites, have been chosen randomly in the study area. A binary variable, Landslid, indicates whether the sample is a landslide or a control site, coded 1 or 0 respectively.
65 Landslide sites are yellow, control sites are black.
66 Topographic variables: The variables have mostly been generated from a digital elevation model. This was created from Shuttle Radar Topography Mission data. The grid size is 25 m.
67 Predictor Variables: elevation (metres); slope (%: 0=flat, 10=vertical); sine of the aspect; cosine of the aspect; absolute deviation from due south (degrees); distance to the nearest watercourse (metres).
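Aspect is a circular quantity (359 degrees and 1 degree are nearly the same direction), which is why it enters the model through transformed variables. The sketch below derives the three aspect predictors from a raw aspect value, assuming aspect is measured in degrees clockwise from north. That convention and the names used are assumptions, as the slides do not state how the variables were computed.

```python
import math


def aspect_predictors(aspect_degrees: float) -> dict:
    """Derive circular-safe predictors from an aspect in degrees.

    Assumes 0 = north, 90 = east, 180 = south, 270 = west.
    """
    a = aspect_degrees % 360.0
    radians = math.radians(a)
    return {
        "sin_aspect": math.sin(radians),   # +1 facing east, -1 facing west
        "cos_aspect": math.cos(radians),   # +1 facing north, -1 facing south
        "abs_dev_south": abs(a - 180.0),   # 0 facing south, 180 facing north
    }


for aspect in (0.0, 90.0, 180.0, 350.0):
    print(aspect, aspect_predictors(aspect))
```

The sine and cosine together capture east-west and north-south exposure, while the deviation from due south gives a single measure of how far a slope faces away from the south.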
68 The data file: The data file is named landslides.csv and is in the \SampleData\Clearwater folder. There is also a georeferenced scanned map of the study area called basemap.jpg.
69 Input and Output Files
70 Model Editor: For the first model use Landslid as the Dependent Variable, and Elev and Slope as the Independent Variables. Use an Adaptive kernel; we have Cartesian coordinates.
71 Running the Model: As with Poisson regression, the model is fitted using iteratively reweighted least squares. This means it will take a little longer to run than an ordinary GWR, so be patient.
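For the logistic case the iteration has the same shape as for Poisson, with different weights and working response. The sketch below fits a global logistic model with an intercept and one predictor. It illustrates the technique and is not the GWR3 code: the function name and the toy data are hypothetical.

```python
import math


def fit_logistic_irls(y, x, max_iter=50, tol=1e-10):
    """Fit log(p / (1 - p)) = b0 + b1 * x by IRLS.

    Returns (b0, b1, number_of_iterations).
    """
    b0 = b1 = 0.0
    for iteration in range(1, max_iter + 1):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for y_i, x_i in zip(y, x):
            eta = b0 + b1 * x_i
            p = 1.0 / (1.0 + math.exp(-eta))   # fitted probability
            w = p * (1.0 - p)                  # IRLS weight for the logit link
            if w < 1e-12:
                # Fitted probabilities of 0 or 1 signal the convergence
                # problem seen when outcomes are all zero or all one.
                raise RuntimeError("fitted probability reached 0 or 1")
            z = eta + (y_i - p) / w            # working response
            s_w += w
            s_wx += w * x_i
            s_wxx += w * x_i * x_i
            s_wz += w * z
            s_wxz += w * x_i * z
        # Solve the 2x2 weighted normal equations in closed form.
        det = s_w * s_wxx - s_wx ** 2
        if det == 0.0:
            raise ValueError("predictor has no variation; cannot solve")
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        change = max(abs(new_b0 - b0), abs(new_b1 - b1))
        b0, b1 = new_b0, new_b1
        if change < tol:
            return b0, b1, iteration
    raise RuntimeError("IRLS did not converge")


# Hypothetical binary outcomes that overlap along x (not perfectly separated).
toy_y = [0, 0, 1, 0, 1, 0, 1, 1]
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(fit_logistic_irls(toy_y, toy_x))
```

The guard on the weights reflects why local fits can fail when a neighbourhood contains only landslide sites or only control sites.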
72 Control and Listing Files
73 Header: GEOGRAPHICALLY WEIGHTED LOGISTIC REGRESSION. Number of data cases read: 239. Sample data file read... *Number of observations, nobs= 239. *Number of predictors, nvar= 2. P(Landslid=1|X) = Observation Easting extent: Observation Northing extent: Note: We have 239 observations with 2 predictors. The study area extent is 33.5 km by …
74 Calibration: *Adaptive kernel sample size limits: *AICc minimisation begins... Bandwidth AICc ** Convergence after 8 function calls ** Convergence: Local Sample Size= 85. Note: This is quite a large bandwidth; notice that there is not much variation between the AICc values for a range of bandwidths.
75 Global Model: ********** Global Logistic Model Diagnostics ********** Convergence after 5 iterations. Log-likelihood: Deviance (-2LogLikelihood): Number of parameters in model: Akaike Information Criterion: Corrected AIC (AICc): Bayesian Information Criterion: Parameter table columns: Estimate, Std Err, T, Exp(B), Sd(Exp(B)). Rows: Intercept, Elev, Slope. Note the global AICc. The elevation and slope parameters are both significant: higher elevations decrease, and steeper slopes increase, the probability of a landslide.
76 Local Model: ********** Local Logistic Model Diagnostics ********** Log Likelihood: Deviance: Residual sum of squares: Effective number of parameters: Akaike Information Criterion: Corrected AIC: Bayesian Information Criterion: Note: There is a smaller AICc with the local model, so we have some improvement in fit.
77 5-number summaries: PARAMETER 5-NUMBER SUMMARIES. Columns: Label, Minimum, Lwr Quartile, Median, Upr Quartile, Maximum. Rows: Intrcept, Elev, Slope. Note: Most of the local elevation parameters are negative, and most of the local slope parameters are positive.
78 Visualising: As before, you use ArcMap. The scanned basemap is basemap.jpg. The Interchange File must be converted to a coverage. Visualisation is as before.
79 Predicted/Observed: The residual is the difference between the observed y (0/1) and the predicted probability of 1. Values from -1 to -0.5 are locations where no landslide occurred but one has been predicted. Values from 0.5 to 1 are landslide sites where no landslide has been predicted; we might look at these to see what other characteristics they might have which are not in the model.
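The residual ranges on this slide amount to a three-way classification. The sketch below applies those breaks to individual observations. The function name, the labels and the treatment of values exactly on a break are my own choices.

```python
def classify_residual(observed: int, predicted_probability: float) -> str:
    """Classify one site from residual = observed - predicted probability."""
    residual = observed - predicted_probability
    if residual <= -0.5:
        # y = 0 but the model gives a probability of 0.5 or more.
        return "false positive"
    if residual >= 0.5:
        # y = 1 but the model gives a probability of 0.5 or less.
        return "false negative"
    return "correctly classified"


# Hypothetical (observed, predicted probability) pairs.
for y, p in [(1, 0.2), (0, 0.9), (1, 0.9), (0, 0.1)]:
    print(y, p, classify_residual(y, p))
```

Mapping the "false negative" class picks out landslide sites that the model misses, which are the ones the slide suggests examining for characteristics not in the model.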
80 The default symbology choices for the RESID variable; to change these, click on Classify.
81 The blue bars representing the breaks are removed: click to highlight, then right-click to choose Delete Break.
82 3 breaks at -0.5, 0.5 and the top value.
83 Small cluster of 3 false negatives near Sheep Mountain Work Center.
84 End of presentation
More informationThis report details analyses and methodologies used to examine and visualize the spatial and nonspatial
Analysis Summary: Acute Myocardial Infarction and Social Determinants of Health Acute Myocardial Infarction Study Summary March 2014 Project Summary :: Purpose This report details analyses and methodologies
More informationHierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!
Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter
More informationOutline. ArcGIS? ArcMap? I Understanding ArcMap. ArcMap GIS & GWR GEOGRAPHICALLY WEIGHTED REGRESSION. (Brief) Overview of ArcMap
GEOGRAPHICALLY WEIGHTED REGRESSION Outline GWR 3.0 Software for GWR (Brief) Overview of ArcMap Displaying GWR results in ArcMap stewart.fotheringham@nuim.ie http://ncg.nuim.ie ncg.nuim.ie/gwr/ ArcGIS?
More informationPoisson Regression. James H. Steiger. Department of Psychology and Human Development Vanderbilt University
Poisson Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Poisson Regression 1 / 49 Poisson Regression 1 Introduction
More informationLattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III)
Title: Spatial Statistics for Point Processes and Lattice Data (Part III) Lattice Data Tonglin Zhang Outline Description Research Problems Global Clustering and Local Clusters Permutation Test Spatial
More informationLab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )
Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was
More informationSection IX. Introduction to Logistic Regression for binary outcomes. Poisson regression
Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about
More informationReview: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:
Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic
More informationSurvival Analysis I (CHL5209H)
Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really
More informationPart 8: GLMs and Hierarchical LMs and GLMs
Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course
More informationExploratory Spatial Data Analysis (ESDA)
Exploratory Spatial Data Analysis (ESDA) VANGHR s method of ESDA follows a typical geospatial framework of selecting variables, exploring spatial patterns, and regression analysis. The primary software
More informationLecture 3.1 Basic Logistic LDA
y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data
More informationHow to Model Stream Temperature Using ArcMap
How to Model Stream Temperature Using ArcMap Take note: Assumption before proceeding: A temperature point file has been attributed with TauDEM variables. There are three processes described in this document.
More informationLogistic Regression Models to Integrate Actuarial and Psychological Risk Factors For predicting 5- and 10-Year Sexual and Violent Recidivism Rates
Logistic Regression Models to Integrate Actuarial and Psychological Risk Factors For predicting 5- and 10-Year Sexual and Violent Recidivism Rates WI-ATSA June 2-3, 2016 Overview Brief description of logistic
More informationFixed effects results...32
1 MODELS FOR CONTINUOUS OUTCOMES...7 1.1 MODELS BASED ON A SUBSET OF THE NESARC DATA...7 1.1.1 The data...7 1.1.1.1 Importing the data and defining variable types...8 1.1.1.2 Exploring the data...12 Univariate
More informationMODULE 12: Spatial Statistics in Epidemiology and Public Health Lecture 7: Slippery Slopes: Spatially Varying Associations
MODULE 12: Spatial Statistics in Epidemiology and Public Health Lecture 7: Slippery Slopes: Spatially Varying Associations Jon Wakefield and Lance Waller 1 / 53 What are we doing? Alcohol Illegal drugs
More informationMLMED. User Guide. Nicholas J. Rockwood The Ohio State University Beta Version May, 2017
MLMED User Guide Nicholas J. Rockwood The Ohio State University rockwood.19@osu.edu Beta Version May, 2017 MLmed is a computational macro for SPSS that simplifies the fitting of multilevel mediation and
More informationExamining the extent to which hotspot analysis can support spatial predictions of crime
Examining the extent to which hotspot analysis can support spatial predictions of crime Spencer Paul Chainey Thesis submitted in accordance with the requirements of the Degree of Doctor of Philosophy University
More informationModel Selection in GLMs. (should be able to implement frequentist GLM analyses!) Today: standard frequentist methods for model selection
Model Selection in GLMs Last class: estimability/identifiability, analysis of deviance, standard errors & confidence intervals (should be able to implement frequentist GLM analyses!) Today: standard frequentist
More informationIntroduction To Logistic Regression
Introduction To Lecture 22 April 28, 2005 Applied Regression Analysis Lecture #22-4/28/2005 Slide 1 of 28 Today s Lecture Logistic regression. Today s Lecture Lecture #22-4/28/2005 Slide 2 of 28 Background
More informationDose-Response Analysis Report
Contents Introduction... 1 Step 1 - Treatment Selection... 2 Step 2 - Data Column Selection... 2 Step 3 - Chemical Selection... 2 Step 4 - Rate Verification... 3 Step 5 - Sample Verification... 4 Step
More informationBinary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationChapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models
Chapter 14 Logistic Regression, Poisson Regression, and Generalized Linear Models 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 29 14.1 Regression Models
More informationLOGISTIC REGRESSION Joseph M. Hilbe
LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of
More informationFinal Review. Yang Feng. Yang Feng (Columbia University) Final Review 1 / 58
Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple
More informationR Hints for Chapter 10
R Hints for Chapter 10 The multiple logistic regression model assumes that the success probability p for a binomial random variable depends on independent variables or design variables x 1, x 2,, x k.
More informationModels for Binary Outcomes
Models for Binary Outcomes Introduction The simple or binary response (for example, success or failure) analysis models the relationship between a binary response variable and one or more explanatory variables.
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationExam Applied Statistical Regression. Good Luck!
Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.
More informationVarious Issues in Fitting Contingency Tables
Various Issues in Fitting Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Complete Tables with Zero Entries In contingency tables, it is possible to have zero entries in a
More informationApplication of eigenvector-based spatial filtering approach to. a multinomial logit model for land use data
Presented at the Seventh World Conference of the Spatial Econometrics Association, the Key Bridge Marriott Hotel, Washington, D.C., USA, July 10 12, 2013. Application of eigenvector-based spatial filtering
More informationTruck prices - linear model? Truck prices - log transform of the response variable. Interpreting models with log transformation
Background Regression so far... Lecture 23 - Sta 111 Colin Rundel June 17, 2014 At this point we have covered: Simple linear regression Relationship between numerical response and a numerical or categorical
More information1. BINARY LOGISTIC REGRESSION
1. BINARY LOGISTIC REGRESSION The Model We are modelling two-valued variable Y. Model s scheme Variable Y is the dependent variable, X, Z, W are independent variables (regressors). Typically Y values are
More information