SOS3003 Applied data analysis for social science Lecture note Erling Berge Department of sociology and political science NTNU.
|
|
- Gavin Ryan
- 5 years ago
- Views:
Transcription
1 SOS3003 Applied data analysis for social science Lecture note Erling Berge Department of sociology and political science NTNU Erling Berge 00 Literature Logistic regression II Hamilton Ch 7 p7-4 Erling Berge 00 Erling Berge 00
2 Definitions I The probability that person no i shall have the value on the variable Y i will be written Pr(Y i =). Then Pr(Y i ) = - Pr(Y i =) The odds that person no i shall have the value on the variable Y i, here called O i, is the ratio between two probabilities O i ( y ) i ( yi = ) ( y ) Pr pi = = = Pr = p Erling Berge 00 3 i i Definitions II The LOGIT, L i, for person no i (corresponding to Pr(Y i =)) is the natural logarithm of the odds, O i, that person no i has the value on variable Y i. This is written: L i = ln(o i ) = ln{p i /(-p i )} The model assumes that L i is a linear function of the explanatory variables x ji, i.e.: L i = β 0 + Σ j β j x ji, where j=,..,k-, and i=,..,n Erling Berge 00 4 Erling Berge 00
3 Logistic regression: assumptions The model is correctly specified The logit is linear in its parameters All relevant variables are included No irrelevant variables are included x-variables are measured without error Observations are independent No perfect multicollinearity No perfect discrimination Sufficiently large sample Erling Berge 00 5 Assumptions that cannot be tested Model specification All relevant variables are included x-variables are measured without error Observations are independent Two will be tested automatically. If the model can be estimated there is No perfect multicollinearity and No perfect discrimination Erling Berge 00 6 Erling Berge 00 3
4 LOGISTIC REGRESSION Statistical problems may be due to Too small a sample High degree of multicollinearity Leading to large standard errors (imprecise estimates) Multicollinearity is discovered and treated in the same way as in OLS regression High degree of discrimination (or separation) Leading to large standard errors (imprecise estimates) Will be discovered automatically by SPSS Erling Berge 00 7 Assumptions that can be tested Model specification logit is linear in the parameters no irrelevant variables are included Sufficiently large sample What constitutes a sufficiently large sample is not always clear. It depends on how the cases are distributed between 0 and categories. If one of these is too small there will be problems estimating partial effects. It also depends on the number of different patterns in the sample and how cases are distributed across these Erling Berge 00 8 Erling Berge 00 4
5 LOGISTIC REGRESSION: TESTING () Testing implies an assessment of whether statistical problems leads to departure from the assumptions Two tests are useful () The Likelihood ratio test statistic: χ Η Can be used analogous to the F-test () Wald test The square root of this can be used analogous to the t-test Erling Berge 00 9 LOGISTIC REGRESSION: TESTING () The Likelihood Ratio test : A ratio between two Likelihoods is equivalent to a difference between two LogLikelihoods The difference between the LogLikelihood (LL) of two nested models, estimated on the same data, can be used to test which of two models fits the data best, just like the F-statistic is used in OLS regression The test can also be used for singe regression coefficients (single variables). In small samples it has better properties than the Wald statistic Erling Berge 00 0 Erling Berge 00 5
6 LOGISTIC REGRESSION: TESTING (3) The Likelihood Ratio test statistic χ Η = -[LL(model) - LL(model)] will, if the null hypothesis of no difference between the two models is correct, be distributed approximately (for large n) as the chi-square distribution with number of degrees of freedom equal to the difference in number of parameters in the two models (H) Erling Berge 00 Example of a Likelihood Ratio test Model : just constant Model : constant plus one variable χ Η = -[LL(model) - LL(model)] = -LL(model) + LL(model) Find the value of the ChiSquare and the number of degrees of freedom e.g.: LogLikelihood (mod) = 09,/(-) LogLikelihood (mod) = 95,67/(-) From Tab 7.: - Log- Likelihood 09, 95,684 95,69 95,67 95,67 Erling Berge 00 Erling Berge 00 6
7 Logistic Regression in SPSS I Case Processing Summary Unweighted Cases a Selected Cases Included in Analysis Unselected Cases Total Missing Cases Total N Percent a. If weight is in effect, see classification table for the total number of cases. Dependent Variable Encoding Original Value OPEN CLOSE Internal Value 0 Erling Berge 00 3 Logistic Regression in SPSS IIa Iteration History a,b,c Iteration Step Log Coefficients likelihood Constant a. Constant is included in the model. b. Initial - Log Likelihood: 09. c. Estimation terminated at iteration number 3 because parameter estimates changed by less than.00. Erling Berge 00 4 Erling Berge 00 7
8 Logistic Regression in SPSS IIb Classification Table a,b Observed Step 0 SCHOOLS SHOULD CLOSE Overall Percentage OPEN CLOSE a. Constant is included in the model. b. The cut value is.500 Predicted SCHOOLS SHOULD CLOSE Percentage OPEN CLOSE Correct Step 0 Constant Variables in the Equation B S.E. Wald df Sig. Exp(B) Variables not in the Equation Step 0 Variables Overall Statistics lived Score df Sig Erling Berge 00 5 Logistic Regression in SPSS IIIa Iteration History a,b,c,d Iteration Step 3 4 a. Method: Enter - Log Coefficients likelihood Constant lived b. Constant is included in the model. c. Initial - Log Likelihood: d. Estimation terminated at iteration number 4 because parameter estimates changed by less than.00. Erling Berge 00 6 Erling Berge 00 8
9 Logistic Regression in SPSS IIIb Omnibus Tests of Model Coefficients Step Step Block Model Chi-square df Sig Step Model Summary - Log Cox & Snell Nagelkerke likelihood R Square R Square a a. Estimation terminated at iteration number 4 because parameter estimates changed by less than.00. Erling Berge 00 7 Logistic Regression in SPSS IIIc Classification Table a Step Observed SCHOOLS SHOULD CLOSE Overall Percentage a. The cut value is.500 OPEN CLOSE Predicted SCHOOLS SHOULD CLOSE Percentage OPEN CLOSE Correct Step a lived Constant a. Variable(s) entered on step : lived. Variables in the Equation B S.E. Wald df Sig. Exp(B) Erling Berge 00 8 Erling Berge 00 9
10 LOGISTIC REGRESSION: TESTING (4) The Wald test The Wald (or chisquare) test statistic provided by SPSS = t = (b k / SE(b k )) (where t is the t used by Hamilton) can be used for testing single parameters similarly to the t-statistic of the OLS regression Check out Hamilton table 7.: find t, see Wald above If the null hypothesis is correct, t will (for large n) in logistic regression be approximately normally distributed If the null hypothesis is correct, the Wald statistic will (for large n) in logistic regression be approximately chisquare distributed with df= Erling Berge 00 9 Excerpt from Hamilton Table 7. Iterasjon Log likelihood 09, 5,534 49,466 49,38 49,38 49,38 Variables B S.E. Wald df Sig. Exp(B) Lived -,046,05 9,698,00,955 Educ -,66,090 3,404,065,847 Contam,08,465 6,739,009 3,347 Hsc,73,464,99,000 8,784 Constant,73,30,768,84 5,649 Erling Berge 00 0 Erling Berge 00 0
11 Confidence interval for parameter estimates Can be constructed based on the fact that the square root of the Wald statistic approximately follows a normal distribution with degree of freedom b k -t α *SE(b k ) < β k < b k + t α *SE(b k ) where t α is a value taken from the table of the normal distribution with level of significance equal to α Erling Berge 00 Can be constructed based on the t- distribution () If a table of the normal distribution is missing one may use the t-distribution since the t- distribution is approximately normally distributed for large n-k (e.g. for n-k > 0) Erling Berge 00 Erling Berge 00
12 Excerpt from Hamilton Table 7.3 (from SPSS) STATA SPSS Step lived B -,047 S.E.,07 t Wald 7,550 df Prob>t Sig.,006 Exp(B),954 educ -,06,093 4,887,07,84 contam,8,48 7,094,008 3,604 hsc,48,50,508,000,3 female -,05,557,009,96,950 kids -,67,566,406,36,5 nodad -,6,999 4,964,06,08 Constant,894,603 3,59,07 8,060 Erling Berge 00 3 More from Hamilton Table 7.3 Iteration - LogLike lihood Coefficients Const lived educ contam hsc female kids nodad Step0 09, -0,76 Step 47,08,565 -,07 -,30,78,764 -,05 -,365 -,074 4,48,538 -,04 -,87,47,39 -,037 -,580 -, ,054,859 -,046 -,04,69,40 -,050 -,66 -,84 4 4,049,893 -,047 -,06,8,48 -,05 -,67 -,5 5 4,049,894 -,047 -,06,8,48 -,05 -,67 -,6 Erling Berge 00 4 Erling Berge 00
13 Is the model in table 7.3 better than the model in table 7.? LL(model in 7.3) = 4,049/(-) LL(model in 7.) = 49,38/(-) χ Η = -[LL(model 7.) - LL(model 7.3)] Find χ Η value Find H Look up the table of the chisquare distribution Erling Berge 00 5 The model of the probability of observing y= for person i exp( Li ) Pr( yi = ) = E[ yi x] = = + exp + exp( L) K where the logit L =β + β ( L ) i 0 j ji j= of the explanatory variables X i is a linear function It is not easy to interpret the meaning of the β coefficients just based on this formula i Erling Berge 00 6 Erling Berge 00 3
14 The odds ratio The odds ratio, O, can be interpreted as the relative effect of having one variable value rather than another e.g. if x ki = t+ in L i and x ki = t in L i O = O i (Y i = L i )/ O i (Y i = L i ) = exp[l i ]/ exp[l i ] = exp[β k ] Why β k? Erling Berge 00 7 The odds ratio : example I The Odds for answering yes = e b 0+b *Alder+b *Kvinne+b 3 *E.utd+b 4 *Barn i HH The odds ratio for answering yes between women and men = e e b0+ b* Alder+ b* + b3* E. utd + b4* Barn _ i _ HH = b + b * Alder+ b *0 + b * E. utd + b * Barn _ i _ HH e b Remember the rules of power exponents Erling Berge 00 8 Erling Berge 00 4
15 Conditional Effect Plot Set all x-variables except x k to fixed values and enter these into the equation for the logit Plot Pr(Y=) as a function of x k i.e. P =/(+exp[-l]) = /(+exp[-konst - b k x k ]) for all reasonable values of x k, konst is the constant obtained by entering into the logit the fixed values of variables other than x k Erling Berge 00 9 Excerpt from Hamilton Table 7.4 B S.E. Wald df Sig. Exp(B) Minimu m Maximu m Mean lived -,040,05 6,559,00,96,00 8,00 9,680 educ -,97,093 4,509,034,8 6,00 0,00,954 contam,99,477 7,43,006 3,664,00,00,80 hsc,79,490,59,000 9,763,00,00,307 nodad -,73,75 5,696,07,77,00,00,699 Constant,8,330,69,0 8,866 Logit: L = *lived -0.97*educ +.99*contam +.79*hsc -.73*nodad Here we let lived vary and set in reasonable values for other variables Erling Berge Erling Berge 00 5
16 Conditional effect plot from Hamilton table 7.4 (fig7.5): effect of living for a long time in town y=/(+exp(-( x ))) y=/(+exp(-( x ))) y=/(+exp(-( x ))) Mean Max Min Erling Berge 00 3 Conditional effect plot from Hamilton table 7.4 (fig7.6): effect of pollution on own land y=/(+exp(-( x ))) y=/(+exp(-( x ))) y=/(+exp(-( x ))) Mean Max Min Erling Berge 00 3 Erling Berge 00 6
17 Coefficients of determination Logistic regression does not provide measures comparable to the coefficient of determination in OLS regression Several measures analogous to R have been proposed They are often called pseudo R Hamilton uses Aldrich and Nelson s pseudo R = χ /(χ +n) where χ = test statistic for the test of the whole model against a model with just a constant and n= the number of cases Erling Berge Some pseudo R in SPSS SPSS reports Cox and Snell, Nagelkerke, and in multinomial logistic regression also McFadden s proposal for R Aldrich and Nelson s pseudo R can easily be computed by ourselves [pseudo R = χ /(χ +n)] Model Summary Step - Log likelihood *** Cox & Snell R Square *** Nagelkerke R Square *** Pseudo R-Square Cox and Snell Nagelkerke McFadden *** *** *** Erling Berge Erling Berge 00 7
18 Statistical problem: linearity of the logit Curvilinearity of the logit can give biased parameter estimates Scatter plot for y - x is not informative since y only has values To test if the logit is linear in an x-variable one may do as follows Group the x variable For every group find average of y and compute the logit for this value Make a graph of the logits against the grouped x Erling Berge Y= Closing school vs. x= Years lived in town,00 0,80 SCHOOLS SHOULD CLOSE 0,60 0,40 0,0 Scatter plot is not very informative 0,00 0,00 0,00 40,00 60,00 80,00 00,00 YEARS LIVED IN WILLIAMSTOWN Erling Berge Erling Berge 00 8
19 Linearity in logit: example Recall: Logit = L i = ln(o i ) = ln{p i /(-p i )} SCHOOLS SHOULD CLOSE YEARS LIVED IN WILLIAMSTOWN (Banded) <= N OPEN N Within group CLOSE Mean (=p) 3,65 4,50 0,59 7,44 8,4,3,3 Logit Ln(p/(-p)) 0,69 0 0,364-0,4-0,33 -,90 -,90 Erling Berge Is the logit linear in years lived in town? Chart ln(b/(-b)) Maybe! GroupedLived GroupedLived Erling Berge Erling Berge 00 9
20 In case of curvilinearity the odds ratio is non-constant Assume the logit is curvilinear in education. Then the odds ratio for answering yes, adding one year of education, is: e e ( ) ( ) b + b * Alder+ b * Kvinne+ b * E. utd + + b * E. utd + 0 a k utd utd e b + b * Alder+ b * Kvinne+ b *. E utd + b *. E utd 0 a k utd utd ( ) ( ) e = = e 0 *. Eutd e b + b * Eutd. + Eutd. + b + b * Eutd. + utd utd utd utd e b utd = utd ( ) b + b * Eutd. + utd Erling Berge Statistical problems: influence Influence from outliers and unusual x- values are just as problematic in logistic regression as in OLS regression Transformation of x-variables to symmetry will minimize the influence of extreme variable values Large residuals are indicators of large influence Erling Berge Erling Berge 00 0
21 Influence: residuals There are several ways to standardize residuals Pearson residuals Deviance residuals Influence can be based on Pearson residual Deviance residual Leverage (potential for influence): i.e. the statistic h j Erling Berge 00 4 Diagnostic graphs Outlier plots can be based on plots of estimated probability of Y i = (estimated P i ) against Delta* B, Δ B j, or Delta* Pearson Chisquare, Δχ P(j), or Delta* Deviance Chisquare, Δχ D(j) * Delta can be translated as change in Erling Berge 00 4 Erling Berge 00
22 SPSS output Cook's = delta B in Hamilton The logistic regression analogue of Cook's influence statistic. A measure of how much the residuals of all cases would change if a particular case were excluded from the calculation of the regression coefficients. Leverage Value = h in Hamilton The relative influence of each observation on the model's fit. DfBeta(s) is not used by Hamilton in logistic regression The difference in beta value is the change in the regression coefficient that results from the exclusion of a particular case. A value is computed for each term in the model, including the constant. Erling Berge ,70000 Delta B 0,60000 Analog of Cook's influence statistics 0, , , ,0000 0,0000 0, , ,0000 0, , ,80000,00000 Predicted probability Erling Berge Erling Berge 00
23 SPSS output from Save () Unstandardized Residuals The difference between an observed value and the value predicted by the model. Logit Residual ei e ; ˆ i = whereei = yi π i ˆ π ( ˆ π ) i i π i is the probability that y i = ; the hat means estimated value Erling Berge SPSS output from Save () Standardized = Pearson residual The command standardized will make SPSS write a variable called ZRE_ and labelled Normalized residual This is the same as the Pearson residual in Hamilton Studentized = [SQRT(delta deviance chisquare)] The command Studentized will make SPSS write a variable called SRE_ and labelled Standardized residual This is the same as the square root of delta Deviance chisquare in Hamilton, i.e. delta Deviance chisquare = (SRE_) Deviance = Deviance residual The command Deviance will make SPSS write a variable called DEV_ and labelled Deviance value This is the same as the deviance residual in Hamilton Erling Berge Erling Berge 00 3
24 Computation of Δχ P(i) Based on the quantities provided by SPSS we can compute delta Pearson chisquare Where it says r j in the formula we put in ZRE_ and where it says h j we put in LEV_ Δ χ = P( j) r j ( hj ) Erling Berge Computation of Δχ D(i) Based on the quantities provided by SPSS we can compute Delta Deviance Chisquare. To find delta deviance chisquare we square SRE_. Alternatively we put in d j =DEV_ and h j =LEV_ in the formula Δ χ D ( j ) = SRE _* SRE _ Δ χ = D ( j) ( h j ) Erling Berge d j Erling Berge 00 4
25 DeltaDevianceChisquare (with/caseno) 7,00 96,00 6,00 99,00 5,00 DeltaAvviksKjiKv 4,00 3,00 3,00 65,00,00,00 0,00 0, ,0000 0, , ,80000,00000 Predicted probability Erling Berge DeltaDevianceChisquare (with/delta B) 7,00,44 6,00,6366 5,00 DeltaAvviksKjiKv 4,00 3,00,08437,0584,697,33699,00,00 0,00 0, ,0000 0, , ,80000,00000 Predicted probability Erling Berge Erling Berge 00 5
26 Delta Pearson Chisquare (with/caseno) 30,00 96,00 5,00 DeltaPearsonKjiKv 0,00 5,00 0,00 99,00 65,00 5,00 3,00 0,00 0, ,0000 0, , ,80000,00000 Predicted probability Erling Berge 00 5 Delta Pearson Chisquare (with/ delta B) 30,00,44 5,00 DeltaPearsonKjiKv 0,00 5,00 0,00,6366, ,00,08437,0584,697 0,00 0, ,0000 0, , ,80000,00000 Predicted probability Erling Berge 00 5 Erling Berge 00 6
27 Cases with large influence Variables Case No 96 Case No 65 Case No 99 Variables Case No 96 Case No 65 Case No 99 Y=close,00,00,00 ZRE_ 4, -,48-5,36 lived 68,00 40,00,00 DEV_,4 -,98 -,6 educ,00,00,00 DFB0_ -,3,0 -,36 contam,00,00,00 DFB_,0,00,00 hsc,00,00,00 DFB_,0,0,0 nodad,00,00,00 DFB3_ -,08 -,5 -,8 PRE_,05,86,97 DFB4_ -,06 -,7 -,9 COO_,64,34,4 DFB5_ -,08,6,4 RES_,95 -,86 -,97 DeltaPearsonKjiKv 8,34 6,47 9,0 SRE_,46 -,04 -,6 DeltaAvviksKjiKv 6,07 4,4 6,89 Erling Berge From Cases to Patterns The figures shown previously are not identical to those you see in Hamilton Hamilton has corrected for the effect of identical patterns Erling Berge Erling Berge 00 7
28 Influence from a shared pattern of x-variables In a logistic regression with few variables many cases will have the same value on all x-variables. Every combination of x-variable values is called a pattern When many cases have the same pattern, every case may have a small influence, but collectively they may have unusually large influence on parameter estimates Influential patterns in x-values can give biased parameter estimates Erling Berge Influence: Patterns in x-values Predicted value, and hence the residual will be the same for all cases with the same pattern Influence from pattern j can be found by means of The frequency of the pattern Pearson residual Deviance residual Leverage: i.e. the statistic h j Erling Berge Erling Berge 00 8
29 Finding X-pattern by means of SPSS In the Data menu find the command Identify duplicate cases Mark the x-variables that are used in the model and move them to Define matching cases by Cross for Sequential count of matching cases in each group and Display frequencies for created variables This produces two new variables. One, MatchSequence, numbers cases sequentially,, where several patterns are identical. If the pattern is unique this variable has the value 0. The other variable, Primary, has the value 0 for duplicates and for unique patterns Erling Berge X-patterns in SPSS; Hamilton p38-4 Frequency Percent Valid Percent Cumulative Percent Duplicate Case 3,7 3,7 3,7 Primary Case Total ,3 00,0 86,3 00,0 00,0 Sequential count of matching cases 0 [5 patterns with case] [7 patterns with or 3 cases] [7 4=3 patterns with cases] 3 [4 patterns with 3 cases] Total Frequency Percent 75,,,,6 00,0 Valid Percent 75,,,,6 00,0 Cumulative Percent 75, 86,3 97,4 00,0 Erling Berge Erling Berge 00 9
30 J m j Pˆj Y j r j χ P d j χ D h i h j Hamilton table 7.6 Symbols # unique patterns of x-values in the data (J<=n) # cases with the pattern j (m>=) Predicted probability of Y= for case with pattern j Sum of y-values for cases with pattern j (= # cases with pattern j and y=) Pearson residual for pattern j Pearson Chisquare statistic Deviance residual for pattern j Deviance Chisquare statistic Leverage for case i Leverage for pattern j Erling Berge New values for Δχ P(i) and Δχ D(i) By Compute one may calculate the Pearson residual (equation 7.9 in Hamilton) and delta Pearson chisquare (equation 7.4 in Hamilton) once more. This will provide the correct values The same applies for deviance residual (equation 7.) and delta deviance chisquare (equation 7.5a) Erling Berge Erling Berge 00 30
31 Leverage and residuals () Leverage of a pattern is obtained as number of cases with the pattern times the leverage of a case with this pattern. The leverage of a case is the same as in OLS regression h j = m j *h i Pearson residual can be found from r j = Y mpˆ m Pˆ j j j ( Pˆ ) j j j Erling Berge 00 6 Leverage and residuals () Deviance residual can be found from Y j mj Y j d j =± Yj ln + ( mj Yj) ln mpˆ ( ˆ j j mj Pj) Erling Berge 00 6 Erling Berge 00 3
32 Two Chi-square statistics Pearson Chi-square statistics Deviance Chi-square statistics Equations are the same for both cases and patterns χ χ J P = rj j= J D = d j j= Erling Berge The Chisquare statistics Both Chisquare statistics:. Pearson-Chisquare χ P and. Deviance-Chisquare χ D Can be read as a test of the null hypothesis of no difference between the estimated model and a saturated model, that is a model with as many parameters as there are cases/ patterns Erling Berge Erling Berge 00 3
33 Large values of measures of influence Measures of influence based on changes in (Δ) the statistic/ parameter value due to excluded cases with pattern j ΔB j delta B - analogue to Cook s D Δχ P(i) delta Pearson-Chisquare Δχ D(i) delta Deviance-Chisquare Erling Berge What is a large value of Δχ P(i) and Δχ D(i) Both Δχ P(i) and Δχ D(i) measure how badly the model fits the pattern j. Large values indicates that the model would fit the data much better if all cases with this pattern were excluded Since both measures are distributed asymptotically as the chisquare distribution, values larger than 4 indicate that a pattern affects the estimated parameters significantly Erling Berge Erling Berge 00 33
34 ΔB j delta B Measures the standardized change in the estimated parameters (b k ) that obtain when all cases with a given pattern j are excluded Δ B = j rh j j ( h ) Larger values means larger influence ΔB j >= must in any case be seen as large influence j Erling Berge delta B (with caseno) 0,70 99,00 0,60 0,50 96,00 deltab_m 0,40 0,30 66,00 56,00 65,00 0,0 0,0 0,00 0, ,0000 0, , ,80000,00000 Predicted probability Erling Berge Erling Berge 00 34
35 Δχ P(i) Delta Pearson Chisquare Measures the reduction in Pearson χ that obtains from excluding all cases with pattern j Δ χ = P( j) r j ( h ) j Erling Berge Delta Pearson Chisquare (with delta B) 30,00,44 5,00 mdeltapearsonkjikv 0,00 5,00 0,00,6366, ,00,08437,04734,038,697 0,00 0, ,0000 0, , ,80000,00000 Predicted probability Erling Berge Erling Berge 00 35
36 Δχ D(i) Delta Deviance Chisquare Measures changes in deviance that obtains j from excluding all Δ χd( j) = cases with pattern j hj This is equivalent to Δ D( j ) = LLK LLK(j) d ( ) χ LL K is the LogLikelihood of a model with K parameters estimated on the whole sample and LL K(j) is from the estimate of the same model when all cases with pattern j are excluded Erling Berge 00 7 Delta Deviance Chisquare (with delta B) 7,00,44 6,00,6366 5,00,04734 mdeltaavvikskjikv 4,00 3,00,08437,038,33699,00,00 0,00 0, ,0000 0, , ,80000,00000 Predicted probability Erling Berge 00 7 Erling Berge 00 36
37 Influence of excluded cases/patterns Variables in the model lived educ contam hsc nodad Constant *LL(modell) Sample -,040 -,97,99,79 -,73,8-4,65 Logit coefficient Excluding case 99 Δχ P(i) =8,34 -,045 -,4,490,49 -,889,575-35,45 Excluding case 96 Δχ P(i) =9,0 -,05 -,4,38,347 -,658,530-36,4 Erling Berge Influence of excluded cases/patterns y=/(+exp(-( x ))) y=/(+exp(-( x ))) y=/(+exp(-( x ))) Erling Berge Erling Berge 00 37
38 Conclusions () Ordinary OLS do not work well for dichotomous dependent variables since It is impossible to obtain normally distributed errors or homoscedasticity, and since The model predicts probabilities outside the interval [0-] The Logit model is for theoretical reasons better Likelihood ratio tests statistic can be used to test nested models analogous to the F-statistic In large samples the chisquare distributed Wald statistic [or the normally distributed t=sqrt(wald)] will be able to test single coefficients and provide confidence intervals There is no statistic similar to the coefficient of determination Erling Berge Conclusions () Coefficient of estimated models can be interpreted by. Log-odds (direct interpretation). Odds 3. Odds ratio 4. Probability (conditional effect plot) Non-linearity, case with influence, and multicollinearity leads to the same kinds of problems as in OLS regression (inaccurate or uncertain parameter values) Discrimination leads to problems of uncertain parameter values (large variance estimates) Diagnostic work is important Erling Berge Erling Berge 00 38
Binary Logistic Regression
The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b
More informationRef.: Spring SOS3003 Applied data analysis for social science Lecture note
SOS3003 Applied data analysis for social science Lecture note 05-2010 Erling Berge Department of sociology and political science NTNU Spring 2010 Erling Berge 2010 1 Literature Regression criticism I Hamilton
More informationLogistic Regression. Continued Psy 524 Ainsworth
Logistic Regression Continued Psy 524 Ainsworth Equations Regression Equation Y e = 1 + A+ B X + B X + B X 1 1 2 2 3 3 i A+ B X + B X + B X e 1 1 2 2 3 3 Equations The linear part of the logistic regression
More informationAdvanced Quantitative Data Analysis
Chapter 24 Advanced Quantitative Data Analysis Daniel Muijs Doing Regression Analysis in SPSS When we want to do regression analysis in SPSS, we have to go through the following steps: 1 As usual, we choose
More informationChapter 19: Logistic regression
Chapter 19: Logistic regression Self-test answers SELF-TEST Rerun this analysis using a stepwise method (Forward: LR) entry method of analysis. The main analysis To open the main Logistic Regression dialog
More informationChapter 19: Logistic regression
Chapter 19: Logistic regression Smart Alex s Solutions Task 1 A display rule refers to displaying an appropriate emotion in a given situation. For example, if you receive a Christmas present that you don
More information7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure).
1 Neuendorf Logistic Regression The Model: Y Assumptions: 1. Metric (interval/ratio) data for 2+ IVs, and dichotomous (binomial; 2-value), categorical/nominal data for a single DV... bear in mind that
More informationClass Notes: Week 8. Probit versus Logit Link Functions and Count Data
Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While
More informationMULTINOMIAL LOGISTIC REGRESSION
MULTINOMIAL LOGISTIC REGRESSION Model graphically: Variable Y is a dependent variable, variables X, Z, W are called regressors. Multinomial logistic regression is a generalization of the binary logistic
More informationDependent Variable Q83: Attended meetings of your town or city council (0=no, 1=yes)
Logistic Regression Kristi Andrasik COM 731 Spring 2017. MODEL all data drawn from the 2006 National Community Survey (class data set) BLOCK 1 (Stepwise) Lifestyle Values Q7: Value work Q8: Value friends
More information8 Nominal and Ordinal Logistic Regression
8 Nominal and Ordinal Logistic Regression 8.1 Introduction If the response variable is categorical, with more then two categories, then there are two options for generalized linear models. One relies on
More informationInvestigating Models with Two or Three Categories
Ronald H. Heck and Lynn N. Tabata 1 Investigating Models with Two or Three Categories For the past few weeks we have been working with discriminant analysis. Let s now see what the same sort of model might
More informationEDF 7405 Advanced Quantitative Methods in Educational Research. Data are available on IQ of the child and seven potential predictors.
EDF 7405 Advanced Quantitative Methods in Educational Research Data are available on IQ of the child and seven potential predictors. Four are medical variables available at the birth of the child: Birthweight
More informationUsing the same data as before, here is part of the output we get in Stata when we do a logistic regression of Grade on Gpa, Tuce and Psi.
Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 14, 2018 This handout steals heavily
More informationIntroducing Generalized Linear Models: Logistic Regression
Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and
More informationLecture 12: Effect modification, and confounding in logistic regression
Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression
More informationAnalysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013
Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/2013 1 Overview Data Types Contingency Tables Logit Models Binomial Ordinal Nominal 2 Things not
More informationGeneralized Linear Models
Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n
More informationRon Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)
Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October
More informationBasic Medical Statistics Course
Basic Medical Statistics Course S7 Logistic Regression November 2015 Wilma Heemsbergen w.heemsbergen@nki.nl Logistic Regression The concept of a relationship between the distribution of a dependent variable
More informationHierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!
Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter
More informationExperimental Design and Statistical Methods. Workshop LOGISTIC REGRESSION. Jesús Piedrafita Arilla.
Experimental Design and Statistical Methods Workshop LOGISTIC REGRESSION Jesús Piedrafita Arilla jesus.piedrafita@uab.cat Departament de Ciència Animal i dels Aliments Items Logistic regression model Logit
More informationCorrelation and regression
1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,
More informationGeneralized linear models
Generalized linear models Outline for today What is a generalized linear model Linear predictors and link functions Example: estimate a proportion Analysis of deviance Example: fit dose- response data
More informationBinary Dependent Variables
Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome
More informationPsychology Seminar Psych 406 Dr. Jeffrey Leitzel
Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Structural Equation Modeling Topic 1: Correlation / Linear Regression Outline/Overview Correlations (r, pr, sr) Linear regression Multiple regression interpreting
More informationESP 178 Applied Research Methods. 2/23: Quantitative Analysis
ESP 178 Applied Research Methods 2/23: Quantitative Analysis Data Preparation Data coding create codebook that defines each variable, its response scale, how it was coded Data entry for mail surveys and
More information1. BINARY LOGISTIC REGRESSION
1. BINARY LOGISTIC REGRESSION The Model We are modelling two-valued variable Y. Model s scheme Variable Y is the dependent variable, X, Z, W are independent variables (regressors). Typically Y values are
More informationLOGISTICS REGRESSION FOR SAMPLE SURVEYS
4 LOGISTICS REGRESSION FOR SAMPLE SURVEYS Hukum Chandra Indian Agricultural Statistics Research Institute, New Delhi-002 4. INTRODUCTION Researchers use sample survey methodology to obtain information
More informationModels for Binary Outcomes
Models for Binary Outcomes Introduction The simple or binary response (for example, success or failure) analysis models the relationship between a binary response variable and one or more explanatory variables.
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationGeneralized Linear Models: An Introduction
Applied Statistics With R Generalized Linear Models: An Introduction John Fox WU Wien May/June 2006 2006 by John Fox Generalized Linear Models: An Introduction 1 A synthesis due to Nelder and Wedderburn,
More informationModel Estimation Example
Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions
More information11. Generalized Linear Models: An Introduction
Sociology 740 John Fox Lecture Notes 11. Generalized Linear Models: An Introduction Copyright 2014 by John Fox Generalized Linear Models: An Introduction 1 1. Introduction I A synthesis due to Nelder and
More informationCount data page 1. Count data. 1. Estimating, testing proportions
Count data page 1 Count data 1. Estimating, testing proportions 100 seeds, 45 germinate. We estimate probability p that a plant will germinate to be 0.45 for this population. Is a 50% germination rate
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationDescription Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see
Title stata.com logistic postestimation Postestimation tools for logistic Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see
More informationFinal Exam. Name: Solution:
Final Exam. Name: Instructions. Answer all questions on the exam. Open books, open notes, but no electronic devices. The first 13 problems are worth 5 points each. The rest are worth 1 point each. HW1.
More informationChapter 15 - Multiple Regression
15.1 Predicting Quality of Life: Chapter 15 - Multiple Regression a. All other variables held constant, a difference of +1 degree in Temperature is associated with a difference of.01 in perceived Quality
More informationLogistic Regression. Building, Interpreting and Assessing the Goodness-of-fit for a logistic regression model
Logistic Regression In previous lectures, we have seen how to use linear regression analysis when the outcome/response/dependent variable is measured on a continuous scale. In this lecture, we will assume
More informationSingle-level Models for Binary Responses
Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =
More informationLogistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression
Logistic Regression Usual linear regression (repetition) y i = b 0 + b 1 x 1i + b 2 x 2i + e i, e i N(0,σ 2 ) or: y i N(b 0 + b 1 x 1i + b 2 x 2i,σ 2 ) Example (DGA, p. 336): E(PEmax) = 47.355 + 1.024
More informationLAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION
LAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION In this lab you will first learn how to display the relationship between two quantitative variables with a scatterplot and also how to measure the strength of
More informationEDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS
EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS The data used in this example describe teacher and student behavior in 8 classrooms. The variables are: Y percentage of interventions
More informationReview of Multiple Regression
Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate
More information2/26/2017. PSY 512: Advanced Statistics for Psychological and Behavioral Research 2
PSY 512: Advanced Statistics for Psychological and Behavioral Research 2 When and why do we use logistic regression? Binary Multinomial Theory behind logistic regression Assessing the model Assessing predictors
More informationUnit 6 - Introduction to linear regression
Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,
More informationGeneralized Linear Models
York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear
More informationLogistic Regression: Regression with a Binary Dependent Variable
Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression
More informationProcedia - Social and Behavioral Sciences 109 ( 2014 )
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 09 ( 04 ) 730 736 nd World Conference On Business, Economics And Management - WCBEM 03 Categorical Principal
More informationSPSS LAB FILE 1
SPSS LAB FILE www.mcdtu.wordpress.com 1 www.mcdtu.wordpress.com 2 www.mcdtu.wordpress.com 3 OBJECTIVE 1: Transporation of Data Set to SPSS Editor INPUTS: Files: group1.xlsx, group1.txt PROCEDURE FOLLOWED:
More informationSTA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3
STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationSection IX. Introduction to Logistic Regression for binary outcomes. Poisson regression
Section IX Introduction to Logistic Regression for binary outcomes Poisson regression 0 Sec 9 - Logistic regression In linear regression, we studied models where Y is a continuous variable. What about
More information13.1 Categorical Data and the Multinomial Experiment
Chapter 13 Categorical Data Analysis 13.1 Categorical Data and the Multinomial Experiment Recall Variable: (numerical) variable (i.e. # of students, temperature, height,). (non-numerical, categorical)
More information7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis
Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression
More informationLecture 14: Introduction to Poisson Regression
Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why
More informationModelling counts. Lecture 14: Introduction to Poisson Regression. Overview
Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week
More informationCh 6: Multicategory Logit Models
293 Ch 6: Multicategory Logit Models Y has J categories, J>2. Extensions of logistic regression for nominal and ordinal Y assume a multinomial distribution for Y. In R, we will fit these models using the
More informationST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses
ST3241 Categorical Data Analysis I Multicategory Logit Models Logit Models For Nominal Responses 1 Models For Nominal Responses Y is nominal with J categories. Let {π 1,, π J } denote the response probabilities
More informationExam Applied Statistical Regression. Good Luck!
Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.
More informationMultinomial Logistic Regression Models
Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word
More informationReview: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:
Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic
More informationChapter 11. Regression with a Binary Dependent Variable
Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score
More informationRegression Model Building
Regression Model Building Setting: Possibly a large set of predictor variables (including interactions). Goal: Fit a parsimonious model that explains variation in Y with a small set of predictors Automated
More informationChapter 9 Regression with a Binary Dependent Variable. Multiple Choice. 1) The binary dependent variable model is an example of a
Chapter 9 Regression with a Binary Dependent Variable Multiple Choice ) The binary dependent variable model is an example of a a. regression model, which has as a regressor, among others, a binary variable.
More information1. Hypothesis testing through analysis of deviance. 3. Model & variable selection - stepwise aproaches
Sta 216, Lecture 4 Last Time: Logistic regression example, existence/uniqueness of MLEs Today s Class: 1. Hypothesis testing through analysis of deviance 2. Standard errors & confidence intervals 3. Model
More informationCohen s s Kappa and Log-linear Models
Cohen s s Kappa and Log-linear Models HRP 261 03/03/03 10-11 11 am 1. Cohen s Kappa Actual agreement = sum of the proportions found on the diagonals. π ii Cohen: Compare the actual agreement with the chance
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More informationLog-linear Models for Contingency Tables
Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A
More informationMaking sense of Econometrics: Basics
Making sense of Econometrics: Basics Lecture 4: Qualitative influences and Heteroskedasticity Egypt Scholars Economic Society November 1, 2014 Assignment & feedback enter classroom at http://b.socrative.com/login/student/
More information2. We care about proportion for categorical variable, but average for numerical one.
Probit Model 1. We apply Probit model to Bank data. The dependent variable is deny, a dummy variable equaling one if a mortgage application is denied, and equaling zero if accepted. The key regressor is
More informationChapter 22: Log-linear regression for Poisson counts
Chapter 22: Log-linear regression for Poisson counts Exposure to ionizing radiation is recognized as a cancer risk. In the United States, EPA sets guidelines specifying upper limits on the amount of exposure
More informationUnit 6 - Simple linear regression
Sta 101: Data Analysis and Statistical Inference Dr. Çetinkaya-Rundel Unit 6 - Simple linear regression LO 1. Define the explanatory variable as the independent variable (predictor), and the response variable
More informationRegression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur
Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur Lecture 10 Software Implementation in Simple Linear Regression Model using
More informationReview of Multinomial Distribution If n trials are performed: in each trial there are J > 2 possible outcomes (categories) Multicategory Logit Models
Chapter 6 Multicategory Logit Models Response Y has J > 2 categories. Extensions of logistic regression for nominal and ordinal Y assume a multinomial distribution for Y. 6.1 Logit Models for Nominal Responses
More informationStatistics in medicine
Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu
More informationLinear Regression Models P8111
Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started
More informationActivity #12: More regression topics: LOWESS; polynomial, nonlinear, robust, quantile; ANOVA as regression
Activity #12: More regression topics: LOWESS; polynomial, nonlinear, robust, quantile; ANOVA as regression Scenario: 31 counts (over a 30-second period) were recorded from a Geiger counter at a nuclear
More informationPoisson Regression. Gelman & Hill Chapter 6. February 6, 2017
Poisson Regression Gelman & Hill Chapter 6 February 6, 2017 Military Coups Background: Sub-Sahara Africa has experienced a high proportion of regime changes due to military takeover of governments for
More informationSociology 593 Exam 2 Answer Key March 28, 2002
Sociology 59 Exam Answer Key March 8, 00 I. True-False. (0 points) Indicate whether the following statements are true or false. If false, briefly explain why.. A variable is called CATHOLIC. This probably
More informationChecking model assumptions with regression diagnostics
@graemeleehickey www.glhickey.com graeme.hickey@liverpool.ac.uk Checking model assumptions with regression diagnostics Graeme L. Hickey University of Liverpool Conflicts of interest None Assistant Editor
More informationInferences for Regression
Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In
More informationEPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7
Introduction to Generalized Univariate Models: Models for Binary Outcomes EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7 EPSY 905: Intro to Generalized In This Lecture A short review
More informationLogistic Regressions. Stat 430
Logistic Regressions Stat 430 Final Project Final Project is, again, team based You will decide on a project - only constraint is: you are supposed to use techniques for a solution that are related to
More informationSTAC51: Categorical data Analysis
STAC51: Categorical data Analysis Mahinda Samarakoon April 6, 2016 Mahinda Samarakoon STAC51: Categorical data Analysis 1 / 25 Table of contents 1 Building and applying logistic regression models (Chap
More informationChapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories.
Chapter Goals To understand the methods for displaying and describing relationship among variables. Formulate Theories Interpret Results/Make Decisions Collect Data Summarize Results Chapter 7: Is There
More informationLecture 4: Multivariate Regression, Part 2
Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above
More informationSimple Linear Regression
Simple Linear Regression 1 Correlation indicates the magnitude and direction of the linear relationship between two variables. Linear Regression: variable Y (criterion) is predicted by variable X (predictor)
More informationAnalysing categorical data using logit models
Analysing categorical data using logit models Graeme Hutcheson, University of Manchester The lecture notes, exercises and data sets associated with this course are available for download from: www.research-training.net/manchester
More informationModel Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)
Model Based Statistics in Biology. Part V. The Generalized Linear Model. Logistic Regression ( - Response) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10, 11), Part IV
More informationAny of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.
STATGRAPHICS Rev. 9/13/213 Calibration Models Summary... 1 Data Input... 3 Analysis Summary... 5 Analysis Options... 7 Plot of Fitted Model... 9 Predicted Values... 1 Confidence Intervals... 11 Observed
More informationEconometrics 1. Lecture 8: Linear Regression (2) 黄嘉平
Econometrics 1 Lecture 8: Linear Regression (2) 黄嘉平 中国经济特区研究中 心讲师 办公室 : 文科楼 1726 E-mail: huangjp@szu.edu.cn Tel: (0755) 2695 0548 Office hour: Mon./Tue. 13:00-14:00 The linear regression model The linear
More informationUNIVERSITY OF TORONTO Faculty of Arts and Science
UNIVERSITY OF TORONTO Faculty of Arts and Science December 2013 Final Examination STA442H1F/2101HF Methods of Applied Statistics Jerry Brunner Duration - 3 hours Aids: Calculator Model(s): Any calculator
More information9 Generalized Linear Models
9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models
More informationThe Logit Model: Estimation, Testing and Interpretation
The Logit Model: Estimation, Testing and Interpretation Herman J. Bierens October 25, 2008 1 Introduction to maximum likelihood estimation 1.1 The likelihood function Consider a random sample Y 1,...,
More informationIntroduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017
Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent
More informationRegression Diagnostics Procedures
Regression Diagnostics Procedures ASSUMPTIONS UNDERLYING REGRESSION/CORRELATION NORMALITY OF VARIANCE IN Y FOR EACH VALUE OF X For any fixed value of the independent variable X, the distribution of the
More informationBIOS 625 Fall 2015 Homework Set 3 Solutions
BIOS 65 Fall 015 Homework Set 3 Solutions 1. Agresti.0 Table.1 is from an early study on the death penalty in Florida. Analyze these data and show that Simpson s Paradox occurs. Death Penalty Victim's
More informationTwo Correlated Proportions Non- Inferiority, Superiority, and Equivalence Tests
Chapter 59 Two Correlated Proportions on- Inferiority, Superiority, and Equivalence Tests Introduction This chapter documents three closely related procedures: non-inferiority tests, superiority (by a
More information