MULTINOMIAL LOGISTIC REGRESSION
- Christian Flowers
- 5 years ago
Model graphically: [diagram of the regressors X, Z, W and the dependent variable Y]

Variable Y is the dependent variable; variables X, Z, W are called regressors. Multinomial logistic regression is a generalization of binary logistic regression.

Model

Multinomial logistic regression is realized as several binary logistic regressions. One category is chosen as the reference category and, for each of the remaining categories, a binary logistic regression model is constructed. For example, let Y have four different values (4 categories), that is, let Y = 1, 2, 3, 4. Let us assume that the reference category is Y = 4 and let variables X, Z, W be the regressors of the model. Then we construct three binary logistic regression models:

$$\frac{P(Y=1)}{P(Y=4)} = \exp\{C_1 + b_{11}X + b_{12}Z + b_{13}W\},$$

$$\frac{P(Y=2)}{P(Y=4)} = \exp\{C_2 + b_{21}X + b_{22}Z + b_{23}W\},$$

$$\frac{P(Y=3)}{P(Y=4)} = \exp\{C_3 + b_{31}X + b_{32}Z + b_{33}W\}.$$

Coefficients $C_1, b_{11}, b_{12}, b_{13}, C_2, b_{21}, b_{22}, b_{23}, C_3, b_{31}, b_{32}, b_{33}$ are unknown. Their estimates $\hat{C}_1, \hat{b}_{11}, \hat{b}_{12}, \hat{b}_{13}, \ldots$ are obtained from data.

Let us assume that Z and W are fixed. Then:
If $b_{11} > 0$, then as X grows, Y = 1 becomes more probable relative to Y = 4.
If $b_{11} < 0$, then as X grows, Y = 4 becomes more probable relative to Y = 1.
Similarly:
If $b_{11} - b_{21} > 0$, then as X grows, Y = 1 becomes more probable relative to Y = 2.
If $b_{11} - b_{21} < 0$, then as X grows, Y = 2 becomes more probable relative to Y = 1.

Forecasting

For concrete values of X, Z, W we calculate

$$z_1 = C_1 + b_{11}X + b_{12}Z + b_{13}W,$$
$$z_2 = C_2 + b_{21}X + b_{22}Z + b_{23}W,$$
$$z_3 = C_3 + b_{31}X + b_{32}Z + b_{33}W$$

and

$$P(Y=1) = \frac{e^{z_1}}{1 + e^{z_1} + e^{z_2} + e^{z_3}}, \qquad P(Y=2) = \frac{e^{z_2}}{1 + e^{z_1} + e^{z_2} + e^{z_3}},$$

$$P(Y=3) = \frac{e^{z_3}}{1 + e^{z_1} + e^{z_2} + e^{z_3}}, \qquad P(Y=4) = \frac{1}{1 + e^{z_1} + e^{z_2} + e^{z_3}}.$$

Here $e = 2.71828\ldots$ is the base of the natural logarithm. We forecast the value with the largest probability.

Odds ratio

For each category we can calculate the odds ratio, that is, by how much the odds change if a regressor grows by one point. Because Y = 4 is the reference category, the number $e^{b_{11}}$ is the odds ratio for P(Y=1)/P(Y=4) when X grows by one point (Z and W are fixed):

$$\left(\frac{P(Y=1)}{P(Y=4)}\right)_{\text{new}} = e^{b_{11}} \left(\frac{P(Y=1)}{P(Y=4)}\right)_{\text{old}}.$$
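The forecasting formulas and the odds ratio property can be checked numerically. The following is a minimal sketch in Python; the coefficient values and the helper names `category_probabilities` and `probs_at` are hypothetical and serve only as an illustration with a single regressor X.

```python
import math

def category_probabilities(z):
    """Return probabilities for categories 1..k-1 followed by the reference category.

    z holds the linear predictors z_1..z_{k-1}; the reference category
    has linear predictor 0 by construction.
    """
    denom = 1.0 + sum(math.exp(v) for v in z)
    probs = [math.exp(v) / denom for v in z]
    probs.append(1.0 / denom)  # reference category (Y = 4 here)
    return probs

# Hypothetical coefficients (C_j, b_j1) for one regressor X; illustration only.
coef = [(0.5, 0.8), (-0.2, 0.3), (1.0, -0.4)]

def probs_at(x):
    """Probabilities of Y = 1, 2, 3, 4 at the given value of X."""
    return category_probabilities([c + b * x for c, b in coef])

p_old = probs_at(1.0)
p_new = probs_at(2.0)  # X grows by one point

# Odds of Y=1 against the reference Y=4 are multiplied by exp(b_11).
ratio_14 = (p_new[0] / p_new[3]) / (p_old[0] / p_old[3])
# Odds of Y=1 against Y=2 are multiplied by exp(b_11 - b_21).
ratio_12 = (p_new[0] / p_new[1]) / (p_old[0] / p_old[1])

# Forecast: the category (numbered from 1) with the largest probability.
forecast = max(range(4), key=lambda j: p_new[j]) + 1

print(ratio_14, math.exp(0.8))
print(ratio_12, math.exp(0.8 - 0.3))
print(forecast)
```

The second ratio illustrates why the sign of $b_{11} - b_{21}$ decides whether Y = 1 gains or loses against Y = 2 as X grows.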
Data

The dependent variable Y is categorical; regressors are interval or categorical variables. Interval regressors do not correlate strongly. For each category of Y we have a sufficient amount of data.

Main steps

1) Check the p-value of the maximum likelihood statistic. Check that p < 0.05. If not, the model is unacceptable.
2) Check that all regressors are statistically significant (all p < 0.05). Drop insignificant regressors from the model.
3) Check the classification table. If there are many false classifications, the model is unacceptable.
4) Additionally, one can check that the deviance test's p ≥ 0.05 (for models with few interval regressors).

For a good model:
Maximum likelihood statistic's p < 0.05.
Maximum likelihood p < 0.05 for all regressors.
Wald's p < 0.05 for the majority of regressors in all sub models.
The percent of correctly classified observations for each category is larger than the percent of data belonging to that category.
For all data Cook's distance ≤ 1.
The chosen pseudo R square (coefficient of determination) ≥ 0.20.
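The classification criterion (each category is classified correctly more often than its share in the data) can be sketched in Python as follows. The counts in `table` are hypothetical, and the function name `classification_check` is an assumption made for this illustration.

```python
def classification_check(table):
    """Compare per-category accuracy with the category's share of the data.

    table[i][j] is the number of cases observed in category i
    and predicted as category j.
    Returns a list of (share, percent_correct, passes) tuples, one per category.
    """
    total = sum(sum(row) for row in table)
    report = []
    for i, row in enumerate(table):
        share = sum(row) / total      # share of all data in category i
        correct = row[i] / sum(row)   # share of category i classified correctly
        report.append((share, correct, correct > share))
    return report

# Hypothetical classification table (rows: observed, columns: predicted).
table = [
    [70, 20, 10],
    [15, 80, 5],
    [10, 25, 65],
]

report = classification_check(table)
for share, correct, passes in report:
    print(f"share={share:.3f} correct={correct:.3f} ok={passes}")
```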
Multinomial logistic regression with SPSS

1. Data

File ESS4CZ_IL_SE, variables:
cntry: country (CZ Czech Republic, IL Israel, SE Sweden);
stfedu: satisfaction with the country's educational system, from 0 (very bad) to 10 (perfect);
imsclbn: when immigrants can use social benefits (1 at once, 2 after one year, 3 after one year of working and paying taxes, 4 after becoming citizens, 5 never);
imigrantbf: 0 for imsclbn <= 3 and 1 for imsclbn >= 4;
trstprl: trust in parliament (0 very low, ..., 9 very high);
pray: praying (1 every day, ..., 7 never);
hhmmb: number of the remaining members of the household.

We want to check if the regressors can help to distinguish among the countries. We investigate males 20 to 30 years of age. Use Select Cases with the condition: agea <= 30 & agea >= 20 & gndr = …

2. SPSS options

Schematically our model is cntry = f(imigrantbf, pray, stfedu, trstprl, hhmmb).

Analyze -> Regression -> Multinomial Logistic.
Put cntry into Dependent. The interval variables pray, stfedu, trstprl, hhmmb go into Covariate(s). The categorical variable imigrantbf goes into Factor(s). Reference Category can be used if we want another reference category. Statistics -> additionally check Goodness-of-fit and Classification table.
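For readers who prepare the data outside SPSS, the case selection and the recoding of imsclbn into imigrantbf described in the Data section could be sketched in Python as below. The records are hypothetical, and `MALE = 1` is an assumed code for male respondents that is not taken from the text; check the codebook of the data file before relying on it.

```python
def recode_imigrantbf(imsclbn):
    """Return 0 for imsclbn <= 3 and 1 for imsclbn >= 4."""
    return 0 if imsclbn <= 3 else 1

def select_cases(rows, male_code):
    """Keep respondents aged 20 to 30 whose gndr equals male_code."""
    return [r for r in rows if 20 <= r["agea"] <= 30 and r["gndr"] == male_code]

MALE = 1  # assumed code for male respondents (not stated in the text)

# Hypothetical records, for illustration only.
rows = [
    {"cntry": "CZ", "agea": 25, "gndr": 1, "imsclbn": 2},
    {"cntry": "IL", "agea": 31, "gndr": 1, "imsclbn": 4},  # too old
    {"cntry": "SE", "agea": 22, "gndr": 2, "imsclbn": 5},  # not male under the assumed code
    {"cntry": "SE", "agea": 20, "gndr": 1, "imsclbn": 4},
]

selected = select_cases(rows, MALE)
for r in selected:
    r["imigrantbf"] = recode_imigrantbf(r["imsclbn"])

print(selected)
```

Missing values of imsclbn are not handled in this sketch; they would need to be excluded before recoding.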
3. Results

Table Case Processing Summary gives the number of respondents in each category. There should not be one dominant category.

Case Processing Summary
                                                          N      Marginal Percentage
cntry Country                  CZ Czech Republic          …      30.3%
                               IL Israel                  …      …
                               SE Sweden                  …      …
imigrantbf Social benefits     .00 positive attitude      …      …
for immigrants                 1.00 negative attitude     …      …
Valid                                                     …      …
Missing                                                   71
Total                                                     592
Subpopulation                                             461

Table Case Processing Summary is also used for comparison with the classification table. If Czechs comprise 30.3% of all respondents, then by applying the multinomial logistic procedure we should correctly classify a larger percent of Czechs. The classification table is at the very end of the output:

Classification
                         Predicted
Observed                 CZ Czech Republic    IL Israel    SE Sweden    Percent Correct
CZ Czech Republic        …                    …            …            70.3%
IL Israel                …                    …            …            80.1%
SE Sweden                …                    …            …            67.6%
Overall Percentage       30.9%                42.0%        27.1%        73.7%

We see that our model helps to classify correctly 70.3% of Czechs, 80.1% of Israelis and 67.6% of Swedes.

Table Pseudo R-Square contains the pseudo R-squares. We can choose any one of them and be happy that it is larger than 0.20. For example, Nagelkerke R-Square = 0.650.
Pseudo R-Square
Cox and Snell    .575
Nagelkerke       .650
McFadden         .397

The maximum likelihood statistic and its p-value for the whole model are given in Model Fitting Information. Since p < 0.05, we conclude that the overall fit of the model to the data is statistically significant.

Model Fitting Information
Columns: Model Fitting Criteria (-2 Log Likelihood); Likelihood Ratio Tests (Chi-Square, df, Sig.).
Rows: Intercept Only, Final.

Statistical significance of the regressors for the whole model can be found in the table Likelihood Ratio Tests. If p < 0.05, then the regressor is statistically significant. If p >= 0.05, then the regressor is not significant and should be dropped from the model, and the whole analysis should be repeated. In our case all regressors are statistically significant.

Likelihood Ratio Tests
Columns: Model Fitting Criteria (-2 Log Likelihood of Reduced Model); Likelihood Ratio Tests (Chi-Square, df, Sig.).
Rows: Intercept, pray, stfedu, trstprl, hhmmb, imigrantbf.

Table Parameter Estimates gives information about all sub models. Observe that in some sub models various regressors are not significant. For example, pray is significant when distinguishing between Israel and Sweden and insignificant when distinguishing between the Czech Republic and Sweden.
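The quantities in Model Fitting Information and Pseudo R-Square are linked by standard formulas, sketched below in Python. The input numbers are hypothetical, the function name `fit_summary` is an assumption, and the sketch assumes that the -2 log-likelihood values are full log-likelihoods; values printed by SPSS may differ by a constant when cases are aggregated into subpopulations, which leaves the chi-square unchanged but affects the pseudo R-squares.

```python
import math

def fit_summary(dev_null, dev_final, n):
    """Likelihood ratio chi-square and three pseudo R-squares.

    dev_null:  -2 log-likelihood of the intercept-only model
    dev_final: -2 log-likelihood of the final model
    n:         number of valid cases
    """
    lr_chi2 = dev_null - dev_final                        # likelihood ratio statistic
    cox_snell = 1.0 - math.exp(-lr_chi2 / n)
    nagelkerke = cox_snell / (1.0 - math.exp(-dev_null / n))
    mcfadden = 1.0 - dev_final / dev_null
    return {
        "lr_chi2": lr_chi2,
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
        "mcfadden": mcfadden,
    }

# Hypothetical values, for illustration only.
summary = fit_summary(dev_null=1000.0, dev_final=600.0, n=500)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Nagelkerke's measure rescales Cox and Snell's so that its maximum is 1, which is why it is always the larger of the two.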
Parameter Estimates
Columns: B, Std. Error, Wald, df, Sig., Exp(B), 95% Confidence Interval for Exp(B) (Lower Bound, Upper Bound).
Rows for CZ Czech Republic: Intercept, pray, stfedu, trstprl, hhmmb, [imigrantbf=.00], [imigrantbf=1.00] (0, see b).
Rows for IL Israel: Intercept, pray, stfedu, trstprl, hhmmb, [imigrantbf=.00], [imigrantbf=1.00] (0, see b).
a. The reference category is: SE Sweden.
b. This parameter is set to zero because it is redundant.

Each sub model has its own model equation. For example,

$$\frac{P(\text{cntry} = \text{CZ})}{P(\text{cntry} = \text{SE})} = e^{z_1}.$$

Here

$$z_1 = 0.\ldots - 0.011\,\text{pray} + 0.299\,\text{stfedu} - 0.487\,\text{trstprl} + 0.216\,\text{hhmmb} + \begin{cases} 0, & \text{if imigrantbf} = 1, \\ -1.212, & \text{if imigrantbf} = 0. \end{cases}$$

Positive coefficients of stfedu and hhmmb mean that a respondent who is more satisfied with the country's educational system and lives in a larger household is more likely to be from the Czech Republic than from Sweden. Similarly, a respondent who has more trust in parliament is more likely to be from Sweden. Finally, if a respondent has a more positive attitude toward social benefits for immigrants (imigrantbf = 0), it is more probable that he is from Sweden.

Similarly,

$$\frac{P(\text{cntry} = \text{IL})}{P(\text{cntry} = \text{SE})} = e^{z_2}.$$

Here

$$z_2 = 5.077 - 0.573\,\text{pray} - 0.379\,\text{stfedu} - 0.25\,\text{trstprl} + 0.704\,\text{hhmmb} + \begin{cases} 0, & \text{if imigrantbf} = 1, \\ -0.821, & \text{if imigrantbf} = 0. \end{cases}$$

All interpretations of the coefficient signs are similar to those above.

4. Forecasting

Let pray = 2, stfedu = 4, trstprl = 5, hhmmb = 4, imigrantbf = 1. Then

$$z_1 = 0.\ldots - 0.011 \cdot 2 + 0.299 \cdot 4 - 0.487 \cdot 5 + 0.216 \cdot 4 = 0.149,$$
$$z_2 = 5.077 - 0.573 \cdot 2 - 0.379 \cdot 4 - 0.25 \cdot 5 + 0.704 \cdot 4 = 3.981.$$

And

$$e^{z_1} = 1.160, \qquad e^{z_2} = 53.570, \qquad e^{z_1} + e^{z_2} = 54.73.$$

And

$$P(\text{cntry} = \text{SE}) = \frac{1}{55.73} = 0.01794, \qquad P(\text{cntry} = \text{CZ}) = \frac{1.160}{55.73} = 0.0208, \qquad P(\text{cntry} = \text{IL}) = \frac{53.570}{55.73} = 0.9612.$$

Most likely this respondent is from Israel.

5. Interaction of variables

If we suspect an interaction of variables, we can add products of variables to the model. For example, if we think that cntry = f(imigrantbf, pray, stfedu, trstprl, hhmmb, stfedu*trstprl), then we additionally choose Model -> check Custom/Stepwise -> and put all variables into Forced Entry Terms. The product of stfedu and trstprl appears if we select both variables at once.
In the output we see that this interaction is not statistically significant.

Likelihood Ratio Tests
Columns: Model Fitting Criteria (-2 Log Likelihood of Reduced Model); Likelihood Ratio Tests (Chi-Square, df, Sig.).
Rows: Intercept, hhmmb, imigrantbf, pray, stfedu, trstprl, stfedu * trstprl.
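The forecast of section 4 can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the coefficients of the IL versus SE sub model are entered with the signs that reproduce z_2 = 3.981, and z_1 = 0.149 is taken as given in the text because the intercept of the CZ versus SE sub model is incomplete there. The variable names are chosen for this illustration.

```python
import math

# Coefficients of the IL-versus-SE sub model (signs chosen to reproduce z_2 = 3.981).
b_il = {"const": 5.077, "pray": -0.573, "stfedu": -0.379, "trstprl": -0.25, "hhmmb": 0.704}

# Respondent from section 4; imigrantbf = 1 contributes 0 to both linear predictors.
x = {"pray": 2, "stfedu": 4, "trstprl": 5, "hhmmb": 4}

z1 = 0.149  # CZ-versus-SE linear predictor, taken as given in the text
z2 = b_il["const"] + sum(b_il[name] * value for name, value in x.items())

# Sweden is the reference category, so its linear predictor is 0.
denom = 1.0 + math.exp(z1) + math.exp(z2)
p = {
    "CZ": math.exp(z1) / denom,
    "IL": math.exp(z2) / denom,
    "SE": 1.0 / denom,
}

forecast = max(p, key=p.get)  # category with the largest probability
print(round(z2, 3), {k: round(v, 4) for k, v in p.items()}, forecast)
```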
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationFrequency Distribution Cross-Tabulation
Frequency Distribution Cross-Tabulation 1) Overview 2) Frequency Distribution 3) Statistics Associated with Frequency Distribution i. Measures of Location ii. Measures of Variability iii. Measures of Shape
More informationMcGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination
McGill University Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II Final Examination Date: 20th April 2009 Time: 9am-2pm Examiner: Dr David A Stephens Associate Examiner: Dr Russell Steele Please
More informationNeural networks (not in book)
(not in book) Another approach to classification is neural networks. were developed in the 1980s as a way to model how learning occurs in the brain. There was therefore wide interest in neural networks
More informationStatistical Data Mining and Machine Learning Hilary Term 2016
Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes
More informationFrom the help desk: Comparing areas under receiver operating characteristic curves from two or more probit or logit models
The Stata Journal (2002) 2, Number 3, pp. 301 313 From the help desk: Comparing areas under receiver operating characteristic curves from two or more probit or logit models Mario A. Cleves, Ph.D. Department
More informationSystematic error, of course, can produce either an upward or downward bias.
Brief Overview of LISREL & Related Programs & Techniques (Optional) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 STRUCTURAL AND MEASUREMENT MODELS:
More informationRegression of Inflation on Percent M3 Change
ECON 497 Final Exam Page of ECON 497: Economic Research and Forecasting Name: Spring 2006 Bellas Final Exam Return this exam to me by midnight on Thursday, April 27. It may be e-mailed to me. It may be
More informationStat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010
1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of
More informationUnit 11: Multiple Linear Regression
Unit 11: Multiple Linear Regression Statistics 571: Statistical Methods Ramón V. León 7/13/2004 Unit 11 - Stat 571 - Ramón V. León 1 Main Application of Multiple Regression Isolating the effect of a variable
More informationModel Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)
Model Based Statistics in Biology. Part V. The Generalized Linear Model. Logistic Regression ( - Response) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10, 11), Part IV
More informationClassification. Chapter Introduction. 6.2 The Bayes classifier
Chapter 6 Classification 6.1 Introduction Often encountered in applications is the situation where the response variable Y takes values in a finite set of labels. For example, the response Y could encode
More informationOutline. The binary choice model. The multinomial choice model. Extensions of the basic choice model
Outline The binary choice model Illustration Specification of the binary choice model Interpreting the results of binary choice models ME output The multinomial choice model Illustration Specification
More informationA COEFFICIENT OF DETERMINATION FOR LOGISTIC REGRESSION MODELS
A COEFFICIENT OF DETEMINATION FO LOGISTIC EGESSION MODELS ENATO MICELI UNIVESITY OF TOINO After a brief presentation of the main extensions of the classical coefficient of determination ( ), a new index
More informationConfidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s
Chapter 866 Confidence Intervals for the Odds Ratio in Logistic Regression with Two Binary X s Introduction Logistic regression expresses the relationship between a binary response variable and one or
More information