ADEL AHMED BABTAIN Department of General Studies, Yanbu University College, Saudi Arabia


USING BINARY LOGISTIC REGRESSION AND PROBIT ANALYSIS TO MODEL TEACHER ESTIMATES OF TALENTED AND GIFTED STUDENTS' CHARACTERISTICS WITH THE IDENTIFICATION RESULTS OF MENTAL ABILITY TEST: A COMPARATIVE STUDY

ADEL AHMED BABTAIN
Department of General Studies, Yanbu University College, Saudi Arabia
adel.babtain@yuc.edu.sa

The study aimed to compare binary logistic regression with binary probit analysis in terms of goodness of fit, measures of practical significance, predictive ability, and coefficient interpretation. The study relied on a correlational methodology and a population of 341 fifth-grade male students in the Jeddah City educational area who were nominated for gifted programs. The sample consisted of all 292 valid cases (86% of the population). Two instruments were used to collect data: Renzulli's Scale for Rating Behavioral Characteristics of Superior Students (SRBCSS) and the Saudi National Test of Mental Abilities (SNTMA). The findings showed that binary logistic regression and binary probit analysis give essentially identical results, that the practical significance measures of the two models are also essentially identical, and that $R_L^2$ could be the best measure of practical significance because of its similarity to the $R^2$ used in ordinary least squares (OLS) regression analysis. The study showed that the coefficient interpretations of the two models differ slightly: binary logistic regression allows a more meaningful interpretation of coefficients than binary probit analysis. The study also found that only the creativity characteristic is statistically significant in explaining and predicting the identification of gifted students, while the other characteristics (learning, motivation, and leadership) are statistically insignificant. The study recommended that more comparative investigations of logistic regression and probit analysis be carried out, especially with more spread-out distributions.
It also recommended that the full scale of teacher estimates of the characteristics of talented and gifted students be set aside in favor of relying exclusively on the creativity characteristic scale.

1 Introduction

Linear regression models provide a popular device for organizing data analysis in which researchers focus on the explanation of a dependent variable, Y, as a function of multiple

independent variables, from $X_1$ to $X_k$. However, when linear regression is applied to a dichotomous (binary) dependent variable, the linearity assumption is violated, and a mathematical transformation is needed to linearize the relationship between the dependent and independent variables (Menard, 2002, p.5). Although the two values of a dichotomous dependent variable can be coded with any numbers, coding them as 1 and 0 has advantages. In that case, the mean of the dummy variable equals the proportion of cases with the value 1 and can be interpreted as the probability of having 1 (a specific event or characteristic) (Wright, 1996; Wolfe, 2002; Poston, 2004).

2 Background

2.1 Logit and probit transformation

Although many nonlinear functions can represent the S-shaped curve, the logistic (logit) and probit transformations have become popular (Pampel, 2000, p.10; Walker, 1998; Aldrich & Nelson, 1984). Given that the dummy dependent variable represents the probability $P_i$ of an event (with the dependent variable equaling 1), the logit transformation involves two steps (Guido, Winters & Rains, 2006). First, take the ratio of $P_i$ to $1 - P_i$, that is, the odds of experiencing the event. Then, take the natural logarithm of the odds. The logit thus equals

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) \qquad (1)$$

or, in short, the logged odds. In this way the logit transformation straightens out the nonlinear relationship between X and the original probabilities of Y (Pampel, 2000, p.10).

Probit analysis transforms the probabilities of an event into scores from the cumulative standard normal distribution rather than into logged odds from the logistic distribution (Pampel, 2000, p.54). A standard normal curve table matches Z scores (theoretically ranging from negative infinity to positive infinity, but in practice from -3 to 3) with the proportion of the area under the curve between the absolute value of the Z score and the mean Z score of 0.
With simple calculations, the standard normal table also identifies the proportion of the area from negative infinity to a given Z score. The proportion of the curve at or below each Z score defines the cumulative standard normal distribution. Since that proportion equals the probability of falling at or below the Z score, larger Z scores define greater probabilities in the cumulative standard normal distribution (Pampel, 2000, pp.54-55). Conversely, just as any Z score defines a probability in the cumulative standard normal

distribution, any probability in the cumulative standard normal distribution translates into a Z score. In sum, the cumulative standard normal curve resembles the logistic curve, only with Z scores instead of logged odds along the horizontal axis (Pampel, 2000, p.56).

2.2 Practical significance measures

Although the dependent variable in logistic regression and probit analysis does not have variance in the way continuous variables do in OLS regression, maximum likelihood procedures provide model fit measures analogous to those from least squares regression. In logistic regression and probit analysis, the baseline log likelihood ($\log L_0$) times -2 represents the likelihood of producing the observed data when the parameters for the independent variables equal zero, and it corresponds to the total sum of squares in OLS regression. The model log likelihood ($\log L_M$) times -2 represents the likelihood of producing the observed data with the estimated parameters for the independent variables, and it corresponds to the error sum of squares in OLS regression. The improvement of the model log likelihood relative to the baseline shows the improvement due to the independent variables. Accordingly, these two log likelihoods define an analogy to a proportional-reduction-in-error measure in regression:

$$R_L^2 = \frac{(-2\log L_0) - (-2\log L_M)}{-2\log L_0} \qquad (2)$$

The numerator shows the reduction in model error due to the independent variables, and the denominator shows the error without the independent variables. The resulting value shows the improvement in the log likelihood relative to the baseline. It equals 0 when all the coefficients equal 0, and it has a maximum that comes close to 1 when the independent variables completely determine and explain the dependent variable. However, the measure does not represent explained variance, since log likelihood functions do not deal with variance defined as the sum of squared deviations.
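As a rough illustration of equation (2), the following Python sketch computes $R_L^2$ from two log likelihoods. It also computes the Cox and Snell and Nagelkerke measures that are reported later in Table 4; their formulas are the standard textbook definitions rather than anything stated in this paper, and the function name and the input values are hypothetical.

```python
import math


def pseudo_r2(ll_null: float, ll_model: float, n: int) -> dict:
    """Return three pseudo R^2 measures computed from two log likelihoods.

    ll_null  -- log likelihood of the baseline (intercept-only) model, log L0
    ll_model -- log likelihood of the fitted model, log LM
    n        -- number of cases
    """
    d_null = -2.0 * ll_null    # analogue of the total sum of squares
    d_model = -2.0 * ll_model  # analogue of the error sum of squares

    # Equation (2): proportional reduction in -2 log likelihood.
    mcfadden = (d_null - d_model) / d_null

    # Cox and Snell: 1 - (L0 / LM)^(2/n), written with log likelihoods.
    cox_snell = 1.0 - math.exp((d_model - d_null) / n)

    # Nagelkerke rescales Cox and Snell by its largest attainable value.
    max_cox_snell = 1.0 - math.exp(-d_null / n)
    nagelkerke = cox_snell / max_cox_snell

    return {"mcfadden": mcfadden, "cox_snell": cox_snell, "nagelkerke": nagelkerke}


# Hypothetical log likelihoods, chosen only to show the calculation;
# they are NOT the values obtained in this study.
example = pseudo_r2(ll_null=-100.0, ll_model=-95.0, n=292)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

With these made-up inputs the McFadden value is (200 - 190) / 200 = 0.05, which shows how a model can be statistically significant and still yield a small pseudo $R^2$.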
This and similar measures are therefore referred to as pseudo-variance explained or pseudo $R^2$ (Pampel, 2000, p.49).

2.3 Coefficients interpretation

In linearizing the nonlinear relationships, logistic regression shifts the interpretation of coefficients from changes in probabilities to less intuitive changes in logged odds (Dallal, 2001). The loss of interpretability with the logistic coefficients, however, is balanced by a gain in parsimony: the linear relationship with the logged odds can be summarized with a single coefficient, but the nonlinear relationship with the probabilities cannot be summarized so simply (Pampel, 2000, p.18; Cizek & Fitzgerald, 1999).
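The trade-off described above can be made concrete with a small Python sketch. It implements the logit and probit transformations from Section 2.1 with the standard library and then applies the same shift in logged odds at several starting probabilities. The helper names and the coefficient value 0.5 are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def logit(p: float) -> float:
    """Logged odds of a probability p, as in equation (1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Probability that corresponds to a logged-odds value x."""
    return 1.0 / (1.0 + math.exp(-x))


def probit(p: float) -> float:
    """Z score whose cumulative standard normal probability equals p."""
    return _STD_NORMAL.inv_cdf(p)


def inv_probit(z: float) -> float:
    """Cumulative standard normal probability at or below z."""
    return _STD_NORMAL.cdf(z)


def probability_change(p: float, b: float) -> float:
    """Change in probability when the logged odds rise by b from a start of p."""
    return inv_logit(logit(p) + b) - p


# Hypothetical coefficient: one-unit increase in X adds 0.5 to the logged odds.
b = 0.5
for start in (0.1, 0.5, 0.9):
    print(f"start p={start:.1f}  logit={logit(start):+.3f}  "
          f"probit={probit(start):+.3f}  change in p={probability_change(start, b):+.4f}")
```

The shift in logged odds is the same on every row, yet the change in probability is largest near 0.5 and shrinks toward the extremes, which is why a single coefficient cannot summarize the effect on probabilities.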

On the other hand, probit coefficients show the linear and additive change in Z-score units of the probit transformation (i.e., the inverse of the cumulative standard normal distribution) for a one-unit change in the independent variables (Liao, 1994, p.21). Perhaps even less intuitive than the logged odds, standard units of the cumulative normal distribution have little interpretive value (Pampel, 2000, p.60).

3 Study purpose

The purpose of this study is to answer the following research questions:

1. To what extent do binary logistic regression and binary probit analysis fit behavioral characteristics with giftedness identification?
2. What are the measures of practical significance in both techniques?
3. Are there differences in the abilities of the independent variables in logistic regression and probit analysis?
4. How are logistic regression and probit analysis coefficients interpreted?

4 Method

4.1 Population and sample

The study population was the fifth-grade male students of the Jeddah City educational area who were nominated for gifted programs (341 students), and the sample consisted of all 292 valid cases (86% of the population).

4.2 Instruments

The study relied on two instruments: Renzulli's Scale for Rating Behavioral Characteristics of Superior Students (SRBCSS) and the Saudi National Test of Mental Abilities (SNTMA). The SRBCSS version used in this study consists of four dimensions (leadership, creativity, motivation, and learning) and was standardized for use in Saudi Arabia and Bahrain by Clinton (1988) and Maajeni et al. (1995). The researcher verified the validity and reliability of the instrument by analyzing 50 cases of the sample, and the findings were:

Table 1. Researcher's verification of SRBCSS reliability
[Columns: dimension, number of items, Cronbach's α. Rows: learning, motivation, creativity, leadership. The cell values are not recoverable.]

The table shows that all dimensions of the scale have reasonable reliability coefficients. The researcher also tested internal consistency by calculating the correlation between each dimension and the total score on the SRBCSS.

Table 2. Internal consistency coefficients for the scale dimensions
[Columns, for each of creativity, leadership, motivation, and learning: item, correlation coefficient with the total. The cell values are not recoverable.]
** All correlation coefficients are statistically significant at 0.01.

The findings show that all scale dimensions have high internal consistency coefficients. Together, the findings in Table 1 and Table 2 indicate that the SRBCSS has an acceptable level of reliability.

The second instrument used in the study was the SNTMA, developed by Al-Share et al. (2001). The test includes 81 items covering four abilities (dimensions): verbal (24 items), numerical (26 items), spatial (19 items), and reasoning (18 items). The test was approved by the Saudi Ministry of Education as a formal test to identify students' giftedness across the country. The test shows the following psychometric measures (Al-Share et al., 2001, p.25):

1. The α coefficients for the four dimensions ranged from 0.77 to 0.88, and the α coefficient for the entire test was

2. The correlation coefficients of each dimension with the achievement scores ranged from 0.21 to 0.43, and the coefficient of the entire test was
3. The correlation coefficients of each dimension with the adapted Wechsler test (the Saudi version) were as follows: verbal ability (0.75), numerical ability (0.63), reasoning ability (0.53), and spatial ability (0.57). The correlations with the Wechsler test (application part) were 0.59, 0.48, 0.47, and 0.55 respectively, and the correlation of the entire test with the Wechsler test was
4. Construct validity was verified by testing the statistical significance of the differences in SNTMA means across age categories (from 9 to 16 years). The tests show statistically significant differences among SNTMA means by age, with means increasing as age increases. Moreover, the loadings of all dimensions on the verbal and nonverbal (performance) sections of the Wechsler test were significant, which means that both tests measure one factor, general intellectual ability (Al-Share et al., 2001, p.30).

These findings show that the SNTMA has very good psychometric properties and is valid for identifying Saudi gifted students.

5 Results

5.1 The first research question

To what extent do binary logistic regression and binary probit analysis fit behavioral characteristics with giftedness identification?

To answer this question, the researcher modeled the data using both logistic regression and probit analysis and tested the null hypothesis that all behavioral characteristic coefficients equal zero. The results were as follows:

Table 3. Significance levels for logistic regression and probit analysis models
[Columns: model*, χ² statistic, df, significance. Rows: logistic regression, probit analysis. The cell values are not recoverable.]
* Each model includes the creativity, leadership, motivation, and learning characteristics as independent variables and the category of students (gifted or ungifted) as the dependent variable.
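One common way to test the null hypothesis stated above is a likelihood-ratio chi-square test, which compares -2 times the baseline log likelihood with -2 times the model log likelihood. The sketch below is an illustration under stated assumptions, not the procedure confirmed by the paper: the log likelihood values are invented, df = 4 assumes one coefficient per behavioral characteristic, and the closed-form tail probability used here is valid only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable, for even df only.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(ll_null: float, ll_model: float, df: int):
    """Return the model chi-square statistic and its p-value."""
    statistic = (-2.0 * ll_null) - (-2.0 * ll_model)
    return statistic, chi2_sf_even_df(statistic, df)


# Hypothetical log likelihoods; df = 4 is an assumption (four predictors).
stat, p_value = likelihood_ratio_test(ll_null=-100.0, ll_model=-95.0, df=4)
print(f"model chi-square = {stat:.3f}, p = {p_value:.4f}")
```

A small p-value indicates that at least one coefficient differs from zero; it says nothing about which one, which is what the Wald tests in Section 5.3 address.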
Table 3 shows that, for the logistic regression, the probability of obtaining the observed χ² statistic is 0.06, given that the null hypothesis is true (there is no effect of the independent variables on the dependent variable). The overall logistic regression model is therefore statistically significant, because the p-value is less than the adopted significance level.

Table 3 also gives the probability of obtaining the observed χ² statistic in the probit analysis, given that the null hypothesis is true (there is no effect of the independent variables on the dependent variable). The overall probit model is statistically significant, because this p-value is also less than the adopted significance level. In sum, both the logistic regression and the probit analysis, each with the creativity, leadership, motivation, and learning characteristics as independent variables and the category of students (gifted or ungifted) as the dependent variable, are statistically significant: at least one of the independent variable coefficients is not equal to zero and contributes to identifying gifted students.

5.2 The second research question

What are the measures of practical significance in both techniques?

To answer this question, three practical significance measures were generated, as shown in the table below:

Table 4. Measures of practical significance of logistic regression and probit analysis
[Columns: pseudo $R^2$ measure, logistic regression, probit analysis. Rows: McFadden ($R_L^2$), Cox and Snell ($R_M^2$), Nagelkerke ($R_N^2$). The cell values are not recoverable.]

McFadden $R^2$, Cox and Snell $R^2$, and Nagelkerke $R^2$ are attempts to provide a logistic and probit analogy to $R^2$ in ordinary least squares regression. The findings show that the pseudo $R^2$ measures range from 3.6% to 6.5% for logistic regression and from 3.5% to 6.3% for probit analysis. These percentages reflect the reductions in the log likelihood of the models due to including the leadership, motivation, creativity, and learning characteristics.

5.3 The third research question

Are there differences in independent variables abilities in logistic regression and probit analysis?

The following table shows the independent variable coefficients and the corresponding Wald statistics for logistic regression and probit analysis.

Table 5. Wald tests for model coefficients
[Columns: model, independent variable, coefficient b, standard error, Wald statistic, df, Sig., 95% confidence interval (lower boundary, upper boundary). Rows, for each of logistic regression and probit analysis: creativity, leadership, motivation, learning, constant. The cell values are not recoverable.]

For logistic regression, the Wald statistics show that only the creativity characteristic was statistically significant at the adopted significance level. This means that only creativity has a significant impact on identifying gifted students. On the other hand, the probit analysis shows that none of the independent variables (creativity, leadership, motivation, and learning) is statistically significant, not even creativity, which is statistically significant in the logistic regression model.

5.4 The fourth research question

How are logistic regression and probit analysis coefficients interpreted?

To answer this question, the following interpretations are used. The easiest way to interpret logistic regression coefficients is through the logit. The creativity logit coefficient in Table 5 means that, controlling for the effects of the other independent variables, the log odds of being gifted increase by the value of that coefficient for each unit increase in the creativity characteristic of the SRBCSS. Coefficients can also be interpreted in terms of odds rather than log odds. The creativity logit coefficient can be transformed into an odds coefficient by exponentiating it, so that the odds coefficient equals

$e^b$, which for the creativity variable equals 1.082. This means that for each unit increase in the creativity characteristic of the SRBCSS, the odds of being a gifted student are multiplied by a factor of 1.082. Equivalently, a one-unit increase in the creativity characteristic raises the odds of being a gifted student by 8.2%, since $(1.082 - 1) \times 100 = 8.2$. In general, for any independent variable X with odds $O_X$, if X increases by one unit to X + 1, the odds $O_{X+1}$ equal $O_X$ times $e^b$. The ratio of the odds at X + 1 to the odds at X is therefore

$$\frac{\text{Odds}(X+1)}{\text{Odds}(X)} = \frac{\text{Odds}(X)\,e^{b}}{\text{Odds}(X)} = e^{b} \qquad (3)$$

Thus the coefficient $e^b$, called the odds coefficient, is also known as the odds ratio and abbreviated OR.

Finally, since the mathematical relationship between odds and probabilities is well defined, researchers can convert odds into probabilities with a simple calculation. In this way, effects on logged odds or odds can be translated into effects on probabilities. However, since the relationship between the independent variables and the probabilities is nonlinear, these effects on probabilities cannot be represented by a single coefficient value; they have to be determined at a particular value of the independent variable.

Probit coefficients can also be interpreted in different ways. The easiest is direct interpretation through the probit transformation. The probit formula is

$$Z_i = \Phi^{-1}(P_i) = a + bX_i \qquad (4)$$

where $\Phi^{-1}$ is the inverse of the cumulative standard normal function. The creativity probit coefficient in Table 5 equals 0.047, which means that a one-unit increase in the creativity characteristic increases the probit of being identified as a gifted student (the inverse of the cumulative standard normal function) by 0.047 units. Also, since the relationship between the probit and probabilities is well defined through the cumulative standard normal distribution, probit coefficients can be interpreted as effects of the independent variables on probabilities.
However, as with logistic regression, no single coefficient value summarizes the effects on probabilities at all levels of the independent variables. Researchers should therefore explain the effects of changes in the independent variables on the probabilities of the dependent variable very carefully, and only at stated levels of the independent variables.
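The interpretations in this section can be sketched in Python. The odds ratio 1.082 and the probit coefficient 0.047 are the values quoted in the text; the logit coefficient is derived here as ln(1.082), and the baseline probabilities are hypothetical because the model constants are not available. The function names are illustrative only.

```python
import math
from statistics import NormalDist

ODDS_RATIO = 1.082   # creativity odds ratio quoted in the text
PROBIT_B = 0.047     # creativity probit coefficient quoted in the text
LOGIT_B = math.log(ODDS_RATIO)  # derived: the logit coefficient is ln(OR)

_STD_NORMAL = NormalDist()


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percentage change in the odds for a one-unit increase in X."""
    return (odds_ratio - 1.0) * 100.0


def logit_probability_after_unit_increase(p: float, b: float) -> float:
    """New probability after X rises by one unit under a logit model."""
    odds = p / (1.0 - p)
    new_odds = odds * math.exp(b)   # equation (3): odds are multiplied by e^b
    return new_odds / (1.0 + new_odds)


def probit_probability_after_unit_increase(p: float, b: float) -> float:
    """New probability after X rises by one unit under a probit model."""
    z = _STD_NORMAL.inv_cdf(p)       # probability -> Z score
    return _STD_NORMAL.cdf(z + b)    # shift by b, then Z score -> probability


print(f"percent change in odds: {percent_change_in_odds(ODDS_RATIO):.1f}%")

# Hypothetical baseline probabilities of being identified as gifted.
for baseline in (0.1, 0.3, 0.5):
    via_logit = logit_probability_after_unit_increase(baseline, LOGIT_B)
    via_probit = probit_probability_after_unit_increase(baseline, PROBIT_B)
    print(f"baseline {baseline:.1f}: logit -> {via_logit:.4f}, probit -> {via_probit:.4f}")
```

The effect on the probability depends on the baseline in both models, so any statement about probabilities has to name the level of the independent variables at which it is evaluated.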

6 Discussion

Both the logistic regression and the probit analysis models fit the data relating teacher ratings of behavioral characteristics to giftedness identification, and although their linearizing transformations differ, they gave essentially equivalent results. The measures used in both analyses to assess practical significance also showed nearly identical results. Both models displayed a low level of practical significance even though the overall models were statistically significant. Logistic regression and probit analysis offer several measures of practical significance similar to $R^2$ in OLS regression, but none of them has the same meaning as the OLS measure. The $R^2$ measures in logistic regression and probit analysis do not represent the proportion of variance in the dependent variable explained by the independent variables, which is why they are known as pseudo $R^2$.

The logistic regression model showed that only the creativity characteristic has a statistically significant influence on the identification of giftedness. The probit analysis, on the other hand, did not show statistical significance for any independent variable in explaining and predicting the identification of gifted students. This finding might highlight a slight difference between logistic regression and probit analysis, even though most statisticians believe that the two analyses give essentially identical results, to the extent that the choice between them is considered a matter of individual preference (Pampel, 2000, p.54).
Pampel (2000, p.66) noted that, because part of the difference in coefficients results from the different variances of the transformed dependent variables, most Z scores in probit analysis are slightly larger than those in logistic regression. The creativity characteristic would therefore be expected to reach significance more easily in the probit analysis than in the logistic regression. However, the opposite occurred in this study, which is why researchers should be very careful when significance values are very close to the critical value. The researcher agrees with Pampel (2000, p.60) that logistic regression and probit analysis coefficients vary slightly because of the small differences between the logistic and normal curves, and that the two analyses produce similar substantive results.

The interpretation of the logistic regression coefficients showed that the linear relationship between the independent variables and the logged odds makes it possible to summarize the impact of each independent variable on the dependent variable with a single coefficient value. However, this linearizing transformation shifts the interpretation of coefficients from changes in probabilities to less intuitive changes in logged odds.
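The remark above about the small differences between the logistic and normal curves, and about the different variances of the two transformed variables, can be checked numerically. The sketch below compares the logistic curve with the cumulative standard normal curve after rescaling the horizontal axis. The scale values are not taken from this study: $\pi/\sqrt{3}$ is the standard deviation of the standard logistic distribution, and 1.702 is a commonly quoted rescaling constant; the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Standard deviation of the standard logistic distribution (variance pi^2 / 3).
LOGISTIC_SD = math.pi / math.sqrt(3.0)


def logistic_cdf(x: float) -> float:
    """Cumulative logistic curve evaluated at logged odds x."""
    return 1.0 / (1.0 + math.exp(-x))


def max_gap(scale: float, lo: float = -6.0, hi: float = 6.0, steps: int = 1201) -> float:
    """Largest absolute gap between logistic_cdf(scale * z) and the normal
    cumulative curve at z, over an evenly spaced grid of z values."""
    normal = NormalDist()
    step = (hi - lo) / (steps - 1)
    return max(
        abs(logistic_cdf(scale * (lo + i * step)) - normal.cdf(lo + i * step))
        for i in range(steps)
    )


print(f"logistic standard deviation: {LOGISTIC_SD:.4f}")
for scale in (1.0, 1.702, LOGISTIC_SD):
    print(f"scale {scale:.3f}: largest gap between the curves = {max_gap(scale):.4f}")
```

Without rescaling the two curves differ visibly, while a scale somewhat below 2 brings them within a few hundredths of each other everywhere. This is consistent with logit coefficients being larger than probit coefficients by a roughly constant factor while the substantive conclusions stay similar.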

Probit coefficients show the linear and additive change in Z-score units of the probit transformation, but they are perhaps even less intuitive than the logged odds, because standard units of the cumulative standard normal distribution have little interpretive value. Further, probit analysis does not allow the calculation of an equivalent of odds ratios, and it makes the calculation of changes in probabilities more difficult than in logistic regression (Pampel, 2000). That is why Pampel (2000, p.68) said that in most circumstances researchers will prefer logistic regression. He also said that, given the usefulness of multiplicative odds coefficients in logistic regression, the lack of comparable coefficients in probit analysis may contribute to the greater popularity of logistic regression (p.61).

Finally, probit and logistic regression results show that the logistic regression coefficients exceed the corresponding probit coefficients by a factor varying from 1.5 to 2.2 (Pampel, 2000, p.66). Part of the difference in coefficients results from the different variances of the transformed dependent variables. Most Z scores for the probit analysis are slightly larger than those for the logistic regression (or, more precisely, than the square root of the Wald statistic). So the Z score of an independent variable might reach the 0.05 level of significance in the probit analysis while it does not in the logistic regression (Pampel, 2000, p.67).

7 Conclusion

Logistic regression and probit analysis use different techniques to transform the nonlinear relationship of the dependent variable into a linear relationship with the independent variables. These transformations lead to quite similar but not identical results in the two analyses. The main advantage of logistic regression over probit analysis is that it offers several ways of interpreting coefficients, which are more intuitive.
The study also showed that only the creativity characteristic of the SRBCSS is statistically significant in identifying students' giftedness, which is why this study recommends using only the creativity part of the SRBCSS, instead of the full scale, to identify gifted students. Finally, this study recommends carrying out more comparative investigations of logistic regression and probit analysis, especially with different types of dependent variable distributions, to uncover any systematic differences between the two analyses, especially in their level of conservatism.

References

1. Aldrich, John H. and Nelson, Forrest D. (1984). Linear Probability, Logit, and Probit Models. Sage University Paper Series on Quantitative Applications in the Social Sciences. No ). Beverly Hills, CA: Sage.

2. Al-Share et al. (1999). Identifying and caring of gifted students (In Arabic). Riyadh: King Abdul-Aziz City for Science and Technology.
3. Al-Share et al. (2001). Identifying and detecting gifted students (In Arabic). Riyadh: King Abdul-Aziz City for Science and Technology.
4. Borooah, Vani K. (2002). Logit and Probit: Ordered and Multinomial Models. Sage University Paper Series on Quantitative Applications in the Social Sciences. No ). Beverly Hills, CA: Sage.
5. Breslow, Norman and Holubkov, Richard (1997). Maximum Likelihood Estimation of Logistic Regression Parameters under Two-phase, Outcome-dependent Sampling. Royal Statistical Society. Vol.59, No.2, pp.
6. Cizek, Gregory J. & Fitzgerald, Shawn M. (1999). Methods, Plainly Speaking: An Introduction to Logistic Regression. Measurement & Evaluation in Counseling and Development. Vol.31, January,
7. Clinton, Abdul-Rahman Noor-Addin (1998). Scale for Rating Behavioral Characteristics of Superior Students (SRBCSS). Unpublished paper (In Arabic).
8. Dallal, Gerard E. (2001). Logistic Regression. Available at:
9. Draper, N. R. & Smith, H. (1981). Applied Regression Analysis. 2nd edition. New York: John Wiley & Sons.
10. Eliason, Scott R. (1993). Maximum Likelihood Estimation: Logic and Practice. Sage University Paper Series on Quantitative Applications in the Social Sciences. No ). Beverly Hills, CA: Sage.
11. Fraas, John W. and Newman, Isadore (2003). Ordinary Least Squares Regression, Discriminant Analysis, and Logistic Regression: Questions Researchers and Practitioners Should Address When Selecting an Analytic Technique. Paper Presented at the Annual Meeting of the Eastern Educational Research Association (Hilton Head Island, GA, February 26-March 1, 2003).
12. Fraas, John W.; Drushal, J. Michael; Graham, Jeff (2002). Expressing Logistic Regression Coefficients as Change in Initial Probability Values: Useful Information for Practitioners. Paper Presented at the Annual Meeting of the Mid-Western Educational Research Association (Columbus, Ohio, October 16-19, 2002).
13. Gebotys, Robert (2000). Examples: Binary Logistic Regression. January,
14. Guido, Joseph J., Winters, Paul C. & Rains, Adam B. (2006). Logistic Regression Basics. University of Rochester Medical Center, Rochester, NY. Available at:
15. Hanneman, Robert (n.d.). Multivariate Analysis. Department of Sociology. University of California.
16. Horton, Nicholas J. and Laird, Nan M. (2001). Maximum Likelihood Analysis of Logistic Regression Models with Incomplete Covariate Data and Auxiliary Information. Biometrics. Vol.57, pp.34-42, March
17. Hosmer, David W. & Lemeshow, Stanley (2000). Applied Logistic Regression. 2nd edition. New York: John Wiley & Sons, Inc.
18. Houston, Walter M. & Woodruff, David J. (1997). Empirical Bayes Estimates of Parameters from the Logistic Regression Model. ACT Research Report Series. American Coll. Testing Program, Iowa City, IA.
19. Johnson, Wesley & Watnik, Mitchell (2002). Interpretation of Regression Output: Diagnostics, Graphs & the Bottom Line. University of California, USA.
20. Kerlinger, Fred N. & Pedhazur, Elazar (1973). Multiple Regression in Behavioral Research. New York: Holt, Rinehart and Winston, Inc.
21. Kerlinger, Fred N. (1973). Foundations of Behavioral Research. 2nd edition. New York: Holt, Rinehart and Winston, Inc.
22. King, Gary and Zeng, Langche (2001). Logistic Regression in Rare Events Data. Society for Political Methodology. February 16,
23. King, Jason E. (2002). Logistic Regression: Going beyond Point-and-Click. Paper Presented at the Annual Meeting of the American Educational Research Association (New Orleans, LA, April 1-5, 2002).
24. King, Jason E. (2003). Running A Best-Subsets Logistic Regression: An Alternative to Stepwise Methods. Educational and Psychological Measurement. Vol.63, No.3, June 2003,
25. Kleinbaum, David & Klein, Mitchel (2002). Logistic Regression: A Self-learning Text. USA: Springer.
26. Larson, Ray R. (2002). A Logistic Regression Approach to Distributed IR. University of California, Berkeley. School of Information Management and Systems. SIGIR 02, Tampere, Finland, August 11-15,
27. Lea, Stephen (1997). Multivariate Analysis II: Manifest Variables Analysis. Topic 4: Logistic Regression and Discriminant Analysis. University of Exeter, Department of Psychology. Revised 11th March, Available at:
28. Liao, Tim Futing (1994). Interpreting Probability Models: Logit, Probit, and Other Generalized Linear Models. Sage University Paper Series on Quantitative Applications in the Social Sciences. No ). Beverly Hills, CA: Sage.
29. Maajeni, Osama Hasan & Howidi, Mohammad Abdulraziq (1995). Differences between Superior and Normal Students on Scale for Rating Behavioral Characteristics of Superior Students in the Primary Schools of Bahrain (In Arabic). Kuwait: Education Journal, No. 35, Vol. 9;
30. Menard, Scott (2002). Applied Logistic Regression Analysis. 2nd edition. Sage University Paper Series on Quantitative Applications in the Social Sciences. No ). Beverly Hills, CA: Sage.

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure).

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure). 1 Neuendorf Logistic Regression The Model: Y Assumptions: 1. Metric (interval/ratio) data for 2+ IVs, and dichotomous (binomial; 2-value), categorical/nominal data for a single DV... bear in mind that

More information



