An ordinal number is used to represent a magnitude, such that we can compare ordinal numbers and order them by the quantity they represent.

Magnus Conley
5 years ago
Statistical Methods in Business

Lecture 6. Binomial Logistic Regression

An ordinal number is used to represent a magnitude, so that we can compare ordinal numbers and order them by the quantity they represent. A cardinal number is used to identify an entity, so that we can categorize items by a common characteristic. Hence, quantitative variables use ordinal numbers to represent a quantity, and categorical variables use cardinal numbers to represent a characteristic (but not a magnitude).

We also use a function to indicate whether a point of the universe is a member of a set. Assume that the set $A$ consists of some points of the universe (possibly all of them). Let $x$ be a point of the universe. We define the characteristic function of the set $A$, denoted by $\chi_A$, as:

$$\chi_A(x) = \begin{cases} 1, & \text{if } x \in A \\ 0, & \text{if } x \notin A \end{cases}$$

Sometimes the term indicator function of the set $A$, denoted by $I_A$, is used for the membership status of a point $x$ of the universe in the set $A$. It is identical to the characteristic function of the set $A$. Thus,

$$I_A(x) \equiv \chi_A(x) = \begin{cases} 1, & \text{if } x \in A \\ 0, & \text{if } x \notin A \end{cases}$$

A Binomial Experiment generates two possible outcomes, which we identify by generic names: the SUCCESS outcome and the FAILURE outcome. When the probability measure of the SUCCESS event is assumed to be a constant number, the SUCCESS and FAILURE events have a constant probability distribution, and this probability distribution is called a stationary distribution. Under the constant probability distribution assumption, a binomial experiment is called a Bernoulli Trial.
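As a small illustration of the indicator function and of a Bernoulli Trial with a constant SUCCESS probability, here is a minimal Python sketch. The set `A`, the probability `0.3`, the seed, and the function names are assumptions chosen only for the example; they do not come from the lecture.

```python
import random


def indicator(x, A):
    """Indicator (characteristic) function of the set A: 1 if x is in A, else 0."""
    return 1 if x in A else 0


def bernoulli_trial(p, rng):
    """One Bernoulli trial: 1 (SUCCESS) with constant probability p, else 0 (FAILURE)."""
    return 1 if rng.random() < p else 0


# Hypothetical universe {1, ..., 6} and hypothetical set A of even points.
A = {2, 4, 6}
memberships = [indicator(x, A) for x in range(1, 7)]

# Ten repetitions with the same (stationary) SUCCESS probability.
rng = random.Random(0)  # seeded only to make the run repeatable
outcomes = [bernoulli_trial(0.3, rng) for _ in range(10)]

print(memberships)
print(outcomes)
```

Each entry of `outcomes` is itself the value of an indicator: it is 1 exactly when the trial lands in the SUCCESS event.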

Now imagine that we have a Bernoulli Trial. We can use a categorical (binary) random variable to indicate the outcomes of this experiment, such that

$$X = \begin{cases} 1, & \text{if the outcome is a SUCCESS} \\ 0, & \text{if the outcome is a FAILURE} \end{cases}$$

In our textbook, this variable is called a dummy variable, since it provides no quantitative information. It does, however, identify the characteristic of the observed event.

We can consider any project as a Bernoulli Trial: when a project is completed by utilizing all the available resources at their given values, the project is considered a SUCCESS. Otherwise, if the project cannot be completed by utilizing all the available resources, it is considered a FAILURE.

Every business plan lists the projects to be undertaken, together with all available resources within their capacity constraints. Therefore, it is essential to estimate the probability of SUCCESS for a project, based on all the resources available for it.

A logistic function is a continuous function that represents a significant change in the behavior of a population. It is most simply described by its graph, $f(x)$, and it is a member of the exponential family of functions.
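The notes describe the logistic function only through its graph. The sketch below uses the standard form $f(x) = 1 / (1 + e^{-s(x - x_c)})$ as an assumption; the parameter names `x_critical` and `steepness` are hypothetical and are introduced only to show how the curve moves from the 0 level to the 1 level around a critical value.

```python
import math


def logistic(x, x_critical=0.0, steepness=1.0):
    """Standard logistic curve, assumed here as the shape the notes describe.

    Returns a value strictly between 0 and 1 that equals 0.5 at x_critical.
    """
    z = steepness * (x - x_critical)
    # Two algebraically equal branches, chosen to avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical critical level 5 with a fairly steep transition.
for x in (0, 4, 5, 6, 10):
    print(x, round(logistic(x, x_critical=5.0, steepness=4.0), 4))
```

A larger `steepness` makes the transition around the critical level sharper, but the function stays continuous; it never becomes a jump.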

We can imagine that the behavior stays at the 0 level (FAILURE) while the level of $x$ is less than $x_{CRITICAL} - \delta$. When the level of $x$ reaches the vicinity of the critical mass, $x_{CRITICAL}$, a sudden change in the behavior begins. Within a short span, once $x$ exceeds $x_{CRITICAL} + \delta$, a new behavior pattern has emerged: the population now has a significantly different behavior level (SUCCESS), say 1, and stays at this new level. Here $\delta$ denotes a small, positive quantity. Thus, the logistic function is not a jump function at the point $x_{CRITICAL}$; rather, it represents a significant, quick change in the function around a critical mass level.

We construct a Binomial Logistic Regression Model in order to estimate the probability measure of the success of a project, based on the given resources leading to the success of this project, as follows.

MODEL ASSUMPTION:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$$

where $X_j$ is the value of a resource $j$ that could be related to the success of the project, for $j = 1, 2, \dots, k$, and $\varepsilon$ is an independent random variable with zero mean and constant variance. Also, $Y \equiv \ln(\text{ODDS RATIO})$. Here, $\ln$ denotes the natural logarithm function; its inverse is the exponential function, denoted by $e$ or $\exp$.

The ODDS RATIO provides information about the degree of likelihood of the success event in a Bernoulli Trial. Assume we conducted the trial several times (at least once) and tallied the number of successes, denoted by $n_1$, and the number of failures, denoted by $n_2$, so that the total number of repetitions of the experiment is $n = n_1 + n_2$, where $n > 0$. In this case, we say that the ODDS ARE $n_1$ TO $n_2$, meaning that we have seen $n_1$ successes and $n_2$ failures in those $n$ trials.

From this information, we can represent the degree of chance of the successes relative to the failures by the ODDS RATIO, defined as the ratio of $n_1$ to $n_2$:

$$\text{ODDS RATIO} = \frac{n_1}{n_2}$$

Similarly, another representation of chance may be constructed by using the statistical probability measure for the success and failure events, based on our observations:

$$P(\text{SUCCESS}) = \frac{n_1}{n} \quad \text{and} \quad P(\text{FAILURE}) = \frac{n_2}{n}$$

Here, $P(\cdot)$ denotes the probability measure of an event, defined as the relative frequency of this event in the data set. As $n$ gets larger and larger and goes to infinity, the statistical probability measure converges to the probability measure of the event.
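The tally-based quantities above can be computed directly. This is a minimal sketch; the counts 3 and 1 are made-up numbers, the function name is hypothetical, and the check that the number of failures is positive is an added assumption so that the ratio is defined.

```python
def odds_and_probabilities(n1, n2):
    """Return (odds ratio, P(SUCCESS), P(FAILURE)) from n1 successes and n2 failures.

    Uses the relative-frequency definitions: n1/n2, n1/n and n2/n with n = n1 + n2.
    """
    if n1 < 0 or n2 <= 0:
        raise ValueError("need n1 >= 0 and n2 > 0 for the odds ratio to be defined")
    n = n1 + n2
    return n1 / n2, n1 / n, n2 / n


# Hypothetical tally: the odds are 3 to 1.
odds, p_success, p_failure = odds_and_probabilities(3, 1)
print(odds, p_success, p_failure)
```

With these counts the ratio `p_success / p_failure` equals `odds`, because the common factor `n` cancels.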

Now, if we look at the ratio of P(SUCCESS) to P(FAILURE), we observe that:

$$\frac{P(\text{SUCCESS})}{P(\text{FAILURE})} = \frac{n_1 / n}{n_2 / n}$$

Since $n > 0$ is an integer, we have:

$$\frac{P(\text{SUCCESS})}{P(\text{FAILURE})} = \frac{n_1}{n_2}$$

By observing the equivalence of these two representations, we say that

$$\text{ODDS RATIO} = \frac{P(\text{SUCCESS})}{P(\text{FAILURE})}$$

On the other hand, since SUCCESS and FAILURE are COMPLEMENT EVENTS, we have $P(\text{FAILURE}) = 1 - P(\text{SUCCESS})$. Therefore, we have a relationship between the odds ratio and P(SUCCESS):

$$\text{ODDS RATIO} = \frac{P(\text{SUCCESS})}{1 - P(\text{SUCCESS})}$$

If we know the odds ratio, then we can find P(SUCCESS) from it. Let the ODDS RATIO be denoted by $y$ and P(SUCCESS) by $x$. Thus, we have $y = \frac{x}{1 - x}$, and we solve for the unknown $x$ by using the known $y$:

$$y - yx = x \;\Rightarrow\; y = x + xy \;\Rightarrow\; y = x(1 + y) \;\Rightarrow\; x = \frac{y}{1 + y}$$

Therefore,

$$P(\text{SUCCESS}) = \frac{\text{ODDS RATIO}}{1 + \text{ODDS RATIO}}$$

is the equation to compute the probability measure from the ODDS RATIO information.

Now, we collect data and generate a Binomial Logistic Regression Equation. Assume we have $n$ observations. When we introduce the values of $X_j$, $j = 1, 2, \dots, k$, this equation yields an estimated $Y$, denoted by $\hat{Y}$:

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$

Since

$$\hat{Y} = \ln(\widehat{\text{ODDS RATIO}}),$$

we can pass $\hat{Y}$ through the inverse of the natural logarithm function, which undoes the work done by the natural logarithm on the estimated odds ratio, and obtain the estimated odds ratio:

$$\widehat{\text{ODDS RATIO}} = \exp\left(\ln(\widehat{\text{ODDS RATIO}})\right) = \exp(\hat{Y})$$

Now we can compute the estimated probability of the success event:

$$\hat{P}(\text{SUCCESS}) = \frac{\widehat{\text{ODDS RATIO}}}{1 + \widehat{\text{ODDS RATIO}}}$$
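The chain from the regression equation to the estimated probability can be sketched in a few lines of Python. The coefficients `b` and the resource values `x` below are invented for illustration; they are not estimates from any data set in the lecture, and the function names are hypothetical.

```python
import math


def predicted_log_odds(b, x):
    """Y-hat = b0 + b1*x1 + ... + bk*xk, where b[0] is the intercept."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


def success_probability(y_hat):
    """Undo the natural log to get the estimated odds ratio, then convert to a probability."""
    odds = math.exp(y_hat)      # estimated odds ratio
    return odds / (1.0 + odds)  # estimated P(SUCCESS)


# Hypothetical fitted coefficients (b0, b1, b2) and resource levels (x1, x2).
b = [-1.5, 0.8, 0.4]
x = [2.0, 1.0]

y_hat = predicted_log_odds(b, x)
p_hat = success_probability(y_hat)
print(y_hat, p_hat)
```

As a check on the algebra, `math.log(p_hat / (1 - p_hat))` recovers `y_hat` up to rounding. This direct transcription calls `math.exp` on `y_hat` itself, so very large values of `y_hat` would overflow; a production version would guard against that.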

Therefore, we have achieved what we wanted: we estimated the probability measure of the success of a project, based on the knowledge of all available resources that may be related to the success of the project. Next, we must investigate the data set for evidence of agreement between the data set and the model.

I. GLOBAL INVESTIGATION, DEVIANCE TEST:

$H_0$: There is no significant difference between the model predictions and reality.

$H_1$: There is a significant difference between the model predictions and reality.

Level of significance: $\alpha$

Deviance is a function of the number of predictors used in the model and the maximum of the logarithm of the likelihood function generated by the data.

Test statistic: $\chi^2_{STAT} = \text{DEVIANCE}$, where $\text{DEVIANCE} \sim \chi^2_{n-k-1}$.

Critical value: $\chi^2_{CRITICAL} = \chi^2_{\alpha;\, n-k-1}$.

Decision Rule: If $\text{DEVIANCE} > \chi^2_{CRITICAL}$, then reject $H_0$.

Decision, Case A: We cannot reject $H_0$ at the $\alpha$ level of significance. That means that we are $100(1 - \alpha)\%$ confident that we can use our logistic regression equation to estimate the probability of success for a project.

Decision, Case B: We reject $H_0$ at the $\alpha$ level of significance. That indicates that we cannot use our logistic regression equation to predict the probability of success.

If we cannot reject $H_0$, then we identify which predictors are necessary in the regression equation.
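The notes do not give a formula for the deviance. The sketch below assumes the common definition for ungrouped 0/1 outcomes, namely $-2$ times the log-likelihood evaluated at the fitted probabilities, and applies the decision rule with a critical value supplied by the reader (for example from a chi-square table). The outcomes, the fitted probabilities, and the choice $k = 1$ are all made up for illustration.

```python
import math


def deviance(y, p_hat):
    """Deviance for 0/1 outcomes, assumed to be -2 * log-likelihood at the fitted probabilities."""
    log_likelihood = sum(
        math.log(p) if yi == 1 else math.log(1.0 - p)
        for yi, p in zip(y, p_hat)
    )
    return -2.0 * log_likelihood


def deviance_decision(dev, chi2_critical):
    """Decision rule from the notes: reject H0 when the deviance exceeds the critical value."""
    return "reject H0" if dev > chi2_critical else "cannot reject H0"


# Hypothetical data: n = 5 observed outcomes and their fitted success probabilities.
y = [1, 0, 1, 1, 0]
p_hat = [0.8, 0.3, 0.6, 0.9, 0.2]

dev = deviance(y, p_hat)

# With n = 5 and an assumed k = 1 predictor there are n - k - 1 = 3 degrees of freedom;
# 7.815 is the tabulated upper 5% point of the chi-square distribution with 3 d.f.
decision = deviance_decision(dev, chi2_critical=7.815)
print(round(dev, 4), decision)
```

Statistical packages report this quantity under names such as residual deviance; checking which definition a given package uses is advisable before comparing numbers.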

II. INDIVIDUAL INVESTIGATION FOR EACH PREDICTOR: Wald Test for $\beta_j$; $j = 1, 2, 3, \dots, k$.

$H_0$: $\beta_j = 0$

$H_1$: $\beta_j \neq 0$

Level of significance: $\alpha$ (the same as used in the Global test)

Test statistic:

$$Z_{STAT} = \frac{b_j}{s_{b_j}}, \qquad Z_{STAT} \sim N(0, 1),$$

i.e. a Z-variate, or a standard normal random variable. Here $s_{b_j}$ denotes the standard error of the estimated coefficient $b_j$.

Decision Rule: If $|Z_{STAT}| > Z_{CRITICAL}$, then reject $H_0$, where $Z_{CRITICAL} = Z_{(1 - \alpha/2)}$ is the point that leaves $\alpha/2$ of right-tail area under the standard normal probability density function.

Decision, Case A: We cannot reject $H_0$ at the $\alpha$ level of significance. This implies that there is no need to use the $X_j$ information in order to predict the probability of success.

Decision, Case B: We reject $H_0$ at the $\alpha$ level of significance. That implies that we need to know $X_j$ in order to predict the probability of success.
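The Wald test for a single coefficient can be sketched with the Python standard library alone. The sketch assumes that the test statistic is the estimated coefficient divided by its standard error and that the rejection region is two-sided, with $\alpha/2$ in each tail. The coefficient and standard error values are invented, and the function name is hypothetical.

```python
from statistics import NormalDist


def wald_test(b_j, se_j, alpha=0.05):
    """Two-sided Wald test of H0: beta_j = 0.

    Returns (z_stat, z_critical, reject), where reject is True when |z_stat| > z_critical.
    """
    z_stat = b_j / se_j
    # Point of the standard normal that leaves alpha/2 of area in the right tail.
    z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z_stat, z_critical, abs(z_stat) > z_critical


# Hypothetical estimate b_j = 0.8 with standard error 0.25.
z_stat, z_critical, reject = wald_test(0.8, 0.25)
print(round(z_stat, 3), round(z_critical, 3), reject)

# A weaker hypothetical predictor: b_j = 0.1 with the same standard error.
print(wald_test(0.1, 0.25)[2])
```

In the first case the predictor would be kept (Case B); in the second, the test gives no evidence that the predictor is needed (Case A).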
