Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression


Advances in Pure Mathematics, 2016, 6, 331-341. Published Online April 2016 in SciRes. http://www.scirp.org/journal/apm http://dx.doi.org/10.4236/apm.2016.65024


John E. Kolassa
Department of Statistics and Biostatistics, Rutgers University, Piscataway, NJ, USA

Received 2 November 2015; accepted 9 April 2016; published 22 April 2016

Copyright © 2016 by author and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY).

Abstract

This paper addresses the problem of inference for a multinomial regression model in the presence of likelihood monotonicity. This paper proposes translating the multinomial regression problem into a conditional logistic regression problem, using existing techniques to reduce this conditional logistic regression problem to one with fewer observations and fewer covariates, such that probabilities for the canonical sufficient statistic of interest, conditional on remaining sufficient statistics, are identical, and translating this conditional logistic regression problem back to the multinomial regression setting. This reduced multinomial regression problem does not exhibit monotonicity of its likelihood, and so conventional asymptotic techniques can be used.

Keywords

Polytomous Regression, Likelihood Monotonicity, Saddlepoint Approximation

1. Introduction

We consider the problem of inference for a multinomial regression model. The sampling distribution of responses for this model, and, in turn, its likelihood, may be represented exactly by a certain conditional binary regression model. Some binary regression models and response variable patterns give rise to likelihood functions that do not have a finite maximizer; instead, there exist one or more contrasts of the parameters such that as this contrast is increased to infinity, the likelihood continues to increase.
For these models and response patterns, maximum likelihood estimators for regression parameters do not exist in the conventional sense, and so monotonicity in the likelihood complicates estimation and testing of binary regression parameters. Because of the association between binary regression and multinomial regression, multinomial regression methods inherit this difficulty. In particular, methods like those suggested by [1], using higher-order asymptotic probability approximations like those of [2], are unavailable in these cases, since the methods of [2] use values of the maximized likelihood, both with the parameter of interest fixed and allowed to vary, and use the second derivatives of the likelihood at these two points.

[3] provides a method for diagnosing and adjusting for likelihood monotonicity for conditional testing in binary regression models. This manuscript extends this method to facilitate estimation in multinomial regression, for approximate inference, and in particular makes practical the use of the approximation of [2].

Section 2.1 reviews binary and multinomial regression models, and relations between these models that let one swap back and forth between them. Section 2.2 reviews conditional inference for canonical exponential families. Section 2.3 reviews techniques of [3] for performing conditional inference in the presence of likelihood monotonicity for binary regression, makes a suggestion for improving the efficiency of this earlier technique, and expands on its implication for estimation. Section 2.4 reviews some existing techniques for addressing likelihood monotonicity. Section 2.5 develops new techniques for detection of likelihood monotonicity in multinomial regression models, and discusses non-uniqueness of maximum likelihood estimates in this case. Section 3 applies the techniques of Section 2.5 to some examples. Section 4 presents some conclusions.

2. Methods and Materials

This section describes existing methods used in cases of likelihood monotonicity in multinomial models, and presents new methods for addressing these challenges.

2.1. Multinomial and Logistic Regression Models

Methods will be developed in this manuscript to address both multinomial and binary regression models.
In this section, relationships between these models are made explicit. Consider first the multinomial distribution. Suppose that $M$ multinomial trials are observed; for trial $m \in \{1,\ldots,M\}$, one of $J_m$ alternatives is observed, with alternative $j \in \{1,\ldots,J_m\}$ having probability

$$p_{mj} = n_{mj}\exp(x_{mj}\beta) \Big/ \sum_{k=1}^{J_m} n_{mk}\exp(x_{mk}\beta). \qquad (1)$$

Here $x_{mj} \in \mathbb{R}^K$ (for $\mathbb{R}$ representing the real numbers and $K$ a positive integer) are covariate vectors associated with each of the alternatives, and $n_{mj}$ is the number of replicates with this covariate pattern. These probabilities depend only on the differences between $x_{mj}$ and $x_{mj'}$ for $j \neq j'$; without loss of generality we take $x_{mJ_m} = (0,\ldots,0)$, treating the last category as a baseline. Let $\{W_m, m = 1,\ldots,M\}$ be independent random variables such that $P[W_m = j] = p_{mj}$, and let $Y_{mj} = 1$ if $W_m = j$ and $Y_{mj} = 0$ if $W_m \neq j$. Then the variables $W_m$ are the indices of the selected multinomial outcomes, and $Y_{mj}$ are related indicator variables. The likelihood is given by

$$L(\beta) = P_\beta[Y_{mj} = y_{mj}\ \forall m, j] = \prod_{m=1}^{M}\prod_{j=1}^{J_m} p_{mj}^{y_{mj}}. \qquad (2)$$

Let $X_m$ be the matrix with $J_m$ rows and $K$ columns, with row $j$ given by $x_{mj}$, let $Y_m = (Y_{m1},\ldots,Y_{mJ_m})^\top$, and let $X = (X_1^\top,\ldots,X_M^\top)^\top$ and $Y = (Y_1^\top,\ldots,Y_M^\top)^\top$. Then sufficient statistics for $\beta$ are

$$S = X^\top Y. \qquad (3)$$

The binary regression model is similar; let $\{V_m, m = 1,\ldots,M\}$ be independent binomial random variables with mass function

$$P_\beta[V = v] = \prod_{m=1}^{M}\binom{n_m}{v_m} q_m^{v_m}(1-q_m)^{n_m - v_m}, \qquad (4)$$

where $z_m \in \mathbb{R}^K$,

$$q_m = \frac{\exp(z_m\beta)}{1+\exp(z_m\beta)}, \qquad (5)$$

and $K$ is a positive integer. Sufficient statistics for $\beta$ are given by

$$T = Z^\top V, \qquad (6)$$

for $Z$ the $M \times K$ matrix with row $m$ equal to $z_m$.

The binary regression model can be recast as a multinomial regression model. Furthermore, the multinomial regression model may be expressed as a conditional binary regression model. Suppose that (1) and (2) hold. Let $E_m$ be a matrix with $J_m$ rows and $M$ columns, such that the entries in column $m$ are all 1, and all other entries are 0. Let $Y_m = (Y_{m1},\ldots,Y_{mJ_m})^\top$, with $v_m$ defined analogously, and let

$$Z = \begin{pmatrix} X_1 & E_1 \\ \vdots & \vdots \\ X_M & E_M \end{pmatrix}, \quad\text{and}\quad V = \begin{pmatrix} Y_1 \\ \vdots \\ Y_M \end{pmatrix}, \qquad (7)$$

with $T = Z^\top V$ and $t = Z^\top v$.

2.2. Conditional Inference

The model for $T$ given by (4)-(5) represents a canonical exponential family, and so inference on some components of $\beta$ (without loss of generality, $\beta_1,\ldots,\beta_I$) may be performed by considering the sampling distribution of $T_1,\ldots,T_I$ conditional on $T_{I+1},\ldots,T_K$. Let $Q_{\beta,I}(t)$ represent the probability mass function for this conditional distribution,

$$Q_{\beta,I}(t) = P_\beta[T_1 = t_1,\ldots,T_I = t_I \mid T_{I+1} = t_{I+1},\ldots,T_K = t_K].$$

The probability mass function for $S$ of (3), evaluated at $s$, is exactly the same as this conditional mass function for the model (7), evaluated at $t = (s, n_1,\ldots,n_M)$. Note here the conditioning event for the larger model is expressed in terms of sample sizes in the smaller model. Furthermore, one can relate probabilities calculated with the regression parameters set to zero to the probabilities for a general parameter vector $\beta$ by

$$Q_{\beta,I}(t) = \frac{\exp\left(\sum_{j=1}^{I} t_j\beta_j\right) Q_{0,I}(t)}{\sum_{(u_1,\ldots,u_I)} \exp\left(\sum_{j=1}^{I} u_j\beta_j\right) Q_{0,I}(u_1,\ldots,u_I,t_{I+1},\ldots,t_K)}, \qquad (8)$$

where the sum in the denominator runs over the attainable values of $(T_1,\ldots,T_I)$ given the conditioning event.

When $I = 1$, define the confidence interval for one of the regression parameters (without loss of generality $\beta_1$) with nominal coverage $1-\alpha$, for $\alpha \in (0,1)$, by

$$\mathcal{C}(t,\alpha) = \left\{\beta_1 : \sum_{s \geq t_1} Q_{\beta,1}(s,t_2,\ldots,t_K) \geq \frac{\alpha}{2},\ \sum_{s \leq t_1} Q_{\beta,1}(s,t_2,\ldots,t_K) \geq \frac{\alpha}{2}\right\}. \qquad (9)$$

Then $\mathcal{C}(T,\alpha)$ has coverage probability at least $1-\alpha$, and can be used as a $1-\alpha$ confidence interval. In fact, the coverage $P_\beta[\beta_1 \in \mathcal{C}(T,\alpha)]$ may be strictly greater than $1-\alpha$, and more precise intervals with at least $1-\alpha$ coverage may be constructed as $\mathcal{C}(T,\alpha^*)$, for $\alpha^*$ the largest level satisfying

$$\min_{\beta_1} P_\beta[\beta_1 \in \mathcal{C}(T,\alpha^*)] \geq 1-\alpha. \qquad (10)$$

The cumulative probabilities implicit in (9) may be approximated as

$$\sum_{s \geq t_1} Q_{\beta,1}(s,t_2,\ldots,t_K) \approx 1 - \Phi(\omega) + \phi(\omega)\left(\frac{1}{\zeta} - \frac{1}{\omega}\right), \qquad (11)$$

for $\Phi$ and $\phi$ the standard normal distribution function and density respectively,

$$\omega = \sqrt{2\left(\ell(\hat\beta) - \ell(\tilde\beta)\right)},$$

for $\ell(\beta)$ the logarithm of the likelihood (in the multinomial regression case, given by (2)), $\hat\beta$ maximizing $\ell(\beta)$, with data adjusted for continuity so that $t_1$ is reduced by one half, $\tilde\beta$ maximizing $\ell(\beta)$ with $\beta_1$ fixed at its null value, and

$$\zeta = (\hat\beta_1 - \tilde\beta_1)\sqrt{\det\left(-\ell''(\hat\beta)\right) \Big/ \det\left(-\ell''_{-1}(\tilde\beta)\right)},$$

with $\ell''_{-1}$ the matrix of second derivatives of $\ell$ with respect to all but the first component of the argument [2]. Similar techniques may be applied to the multinomial regression model, with $S$ of (3) replacing $T$ of (6). Denote the conditional sample space by

$$\mathcal{T} = \left\{\left(\sum_{m=1}^{M} z_{mj}v_m,\ j \leq I\right) : \sum_{m=1}^{M} z_{mj}v_m = t_j\ \forall j > I\right\}. \qquad (12)$$

2.3. Infinite Estimates

This section reviews and clarifies techniques for inference in the presence of monotonicity in the logistic regression likelihood (4) given by [3], who built on results of [4]-[6], and [7]. Choose $\lambda \in \mathbb{R}^K$. Let

$$\mathcal{U}_0 = \left\{u \in \mathbb{R}^K : \lim_{a\to\infty} L(\lambda + ua) > 0\right\},$$

for $L$ the logistic regression likelihood of (4). [3], building on the results of [4], determines which observations correspond to extreme fitted probabilities by maximizing the total number of positive entries in both $\rho$ and $\sigma$ subject to constraints outlined in the next theorem.

Theorem 1. Suppose that random vectors $V$ arise from the model (4) and (5), and matrix $Z$ is of full rank. Then $\mathcal{U}_0$ is exactly the set of vectors

$$u = (Z^\top Z)^{-1}Z^\top(\sigma - \rho), \qquad (13)$$

with $\rho$ and $\sigma$ column vectors such that

$$\sigma_j \geq 0,\quad \rho_j \geq 0\quad \forall j, \qquad (14)$$

and such that

$$t^\top (Z^\top Z)^{-1}Z^\top(\sigma - \rho) - n^\top\sigma = 0, \quad\text{and}\quad Z_\perp^\top(\sigma - \rho) = 0, \qquad (15)$$

where $T = Z^\top V$ are the sufficient statistics associated with $\beta$, $t$ is the observed value of this vector $T$, and $Z_\perp$ is a matrix with $M$ rows and rank $M - K$, such that $Z_\perp^\top Z = 0$. Furthermore, the conditional probabilities are the same as those arising if observations with positive entries in $\sigma$ or $\rho$ are omitted, and collinear covariates among columns $I+1,\ldots,K$ removed.

The matrix $Z_\perp$ may be constructed from the QR decomposition of $Z$.
Inference on components of $\beta$ for (4) may be performed conditionally, even when maximum likelihood estimates fail to exist, and hence (11) cannot be used [3].

2.4. Other Approaches

Suppose that there exists a vector $u$ such that

$$x_{mj}u \geq 0 \text{ if } Y_{mj} = 1, \qquad x_{mj}u \leq 0 \text{ if } Y_{mj} = 0, \qquad (16)$$

with strict inequality holding in place of at least one of the inequalities. Then the likelihood $L(\lambda + au)$, defined in (2), is a strictly increasing function of $a$ for any $\lambda \in \mathbb{R}^K$, and so no finite maximizer of $L$ exists. This lack of a finite maximizer leads to difficulties with maximum likelihood estimation and inference. This section reviews some existing approaches.

Bias-correction is possible for maximum likelihood estimators [8], and may be employed in this type of situation [9]. Estimates with this correction applied are the same as those maximizing the posterior density of the parameters under the invariant prior of [10], in which the likelihood function is multiplied by the square root of the determinant of the information matrix; this is equivalent to maximizing a penalized likelihood. That is, if $\ell(\beta) = \log(L(\beta))$, for $L(\beta)$ as in (2), then the approach of [8] suggests maximizing

$$\ell^*(\beta) = \ell(\beta) + \tfrac{1}{2}\log\det\left(-\ell''(\beta)\right).$$

These estimates are always finite. A similar approach is possible in the case of proportional hazards regression, which also gives advantages for testing [11].
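The contrast between a monotone likelihood and its Jeffreys-penalized counterpart can be illustrated with a minimal Python sketch. It assumes a hypothetical one-parameter binary logistic model with four completely separated observations (toy data, not taken from this paper), rather than the multinomial likelihood (2). The names `log_lik`, `penalized_log_lik`, and `argmax_unimodal` are illustrative, and a simple ternary search stands in for whatever optimizer one would use in practice.

```python
import math

# Hypothetical, completely separated data for a one-parameter logistic model
# without intercept: every x < 0 has y = 0 and every x > 0 has y = 1.
X = [-1.0, -1.0, 1.0, 1.0]
Y = [0, 0, 1, 1]


def log_lik(beta):
    """Unpenalized log likelihood: sum of y*x*beta - log(1 + exp(x*beta))."""
    return sum(y * x * beta - math.log1p(math.exp(x * beta)) for x, y in zip(X, Y))


def information(beta):
    """Fisher information: sum of x^2 * q * (1 - q), q the fitted probability."""
    total = 0.0
    for x in X:
        q = 1.0 / (1.0 + math.exp(-x * beta))
        total += x * x * q * (1.0 - q)
    return total


def penalized_log_lik(beta):
    """Log likelihood plus half the log determinant of the (scalar) information."""
    return log_lik(beta) + 0.5 * math.log(information(beta))


def argmax_unimodal(f, lo, hi, tol=1e-10):
    """Ternary search for the maximizer of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)


grid = [0.5 * k for k in range(21)]            # beta = 0, 0.5, ..., 10
unpenalized = [log_lik(b) for b in grid]       # keeps increasing along the grid
beta_star = argmax_unimodal(penalized_log_lik, -10.0, 10.0)
```

For these toy data the unpenalized log likelihood equals $-4\log(1+e^{-\beta})$, which increases toward 0 without attaining a maximum, while the penalized criterion is maximized at a fitted probability of 0.9, that is, at $\beta = \log 9 \approx 2.197$.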

Standard errors may be calculated from the second derivative of the unpenalized log likelihood [8]; one could also calculate standard errors from the second derivative of the penalized log likelihood [11]. This second approach is used in this manuscript for comparison purposes, and so the asymptotic confidence interval considered below for $\beta_1$ is

$$\hat\beta^*_1 \pm z_{\alpha/2}\sqrt{-\ell^{*11}(\hat\beta^*)} \quad\text{for}\quad \hat\beta^* = \operatorname{argmax}_\beta\, \ell^*(\beta). \qquad (17)$$

Here the superscript on $\ell^*$ represents the $(1,1)$ component of the inverse second derivative matrix of $\ell^*$ with respect to $\beta$.

The approach using Jeffreys prior to penalize the likelihood has some disadvantages. The union of all possible confidence intervals resulting as the penalized estimator plus or minus a multiple of the standard error has a finite range, and so the confidence region procedure as described above has vanishing coverage probability for large values of the regression parameter.

2.5. Estimation

We investigate the behavior of maximum likelihood estimates in the multinomial regression model (2). Maximizers for both the original likelihood and for the likelihood of the distribution of sufficient statistics of interest (6) or (3) conditional on the remaining canonical sufficient statistics are considered. Conditional probabilities arising from the logistic regression model (4) are of form (2), and so may also be handled as below.

Consider the occurrence of infinite estimates for model (2). Denote the sample space for the $S$ of (3) by $\mathcal{S}$. For a canonical exponential family with finite support, the maximizer of the likelihood associated with a data point $s$ exists if and only if $s \in \operatorname{hull}(\mathcal{S})^\circ$, where $\operatorname{hull}(\cdot)$ represents the convex hull of its argument, and the superscript $\circ$ represents the interior of the set it modifies ([12], Theorem 9.4). The following two corollaries may be used to determine if the finite maximizers of (2) exist, in terms of the observed sufficient statistic $s$ of (3). The first is a corollary to ([12], Theorem 9.4).

Corollary 1.
Unique finite maximizers of the likelihood given in (2) exist if and only if $X$ is of full rank $K$, and the maximum of $c$ subject to

$$s = \sum_{m=1}^{M}\sum_{j=1}^{J_m} \alpha_{mj}x_{mj}, \qquad \sum_{j=1}^{J_m}\alpha_{mj} = 1\quad \forall m \in \{1,\ldots,M\}, \qquad (18)$$

$$\alpha_{mj} \geq c\quad \forall m \in \{1,\ldots,M\},\ j \in \{1,\ldots,J_m\}, \qquad (19)$$

is greater than zero.

Proof. If $X$ is not of full rank, then multiple parameter values give the same value of $X\beta$, and maximizers cannot be unique. Suppose that $X$ is of full rank. The set $\mathcal{S}$ is defined from (3) as the set of sums of multiples of rows of $X$, with the multipliers taken from $\{0,1\}$, and with the sum of indicators associated with a fixed $m$ equal to 1. Let $\bar{\mathcal{S}} = \{s : (18) \text{ and } (19) \text{ hold},\ c \geq 0\}$, and let $\mathcal{S}^+ = \{s : (18) \text{ and } (19) \text{ hold},\ c > 0\}$. The convex hull of $\mathcal{S}$ is given by $\bar{\mathcal{S}}$. By the theorem of [12], a finite estimator at the data point $s$ exists if and only if $s \in \operatorname{hull}(\mathcal{S})^\circ$. In this case, the interior is defined in terms of the topology of the subset of $\mathbb{R}^K$ containing $s$ subject to (18). The proof is complete when one shows that $\mathcal{S}^+ = \operatorname{hull}(\mathcal{S})^\circ$.

Suppose that $s \in \operatorname{hull}(\mathcal{S})^\circ$. Let $\alpha$ achieve the maximum as described in the statement of the corollary. Choose $m$ and $j$ such that $\alpha_{mj}$ is as small as any component of $\alpha$. There exists $j' \neq j$, $j' \leq J_m$, such that $\alpha_{mj'} \geq 1/J_m$. Let $\gamma$ be the vector configured like $\alpha$, consisting of all zeros except for 1 in the $m, j$ place and $-1$ in the $m, j'$ place. There exists $\epsilon > 0$ such that $s - \epsilon X^\top\gamma \in \operatorname{hull}(\mathcal{S})$. Hence $\alpha$ may be chosen so that all components of $\alpha + \epsilon\gamma$ are non-negative, and, in particular, $\epsilon + \alpha_{mj} > 0$. Hence $s \in \mathcal{S}^+$.

Now take $s \in \mathcal{S}^+$, let $\alpha$ and $c$ be as defined in (19), and choose any vector $v \in \mathbb{R}^K$. Since $X$ is of full rank, $v$ is expressible as $v = X^\top\gamma$, for $\gamma = (\gamma_{11},\ldots,\gamma_{1J_1},\ldots,\gamma_{M1},\ldots,\gamma_{MJ_M})$. Since $x_{mJ_m} = (0,\ldots,0)$ for all $m$, $\gamma$ can be selected such that $\sum_{j=1}^{J_m}\gamma_{mj} = 0$ for all $m$. Let $\epsilon = c/(2\max|\gamma_{mj}|)$. Then $s + \epsilon v \in \operatorname{hull}(\mathcal{S})$.

One may determine whether such a $c$ exists by maximizing $c$ over non-negative $c$ and $\alpha_{11},\ldots,\alpha_{MJ_M}$ satisfying (18) and (19), and checking to see if the maximum is greater than zero. The above maximization may be done via the simplex algorithm. If $s \in \mathcal{S}$, then (18) and (19) are satisfied, with $c = 0$, by the vector $\alpha$ with zeros in each component except those corresponding to $s$; this observation makes optimization of $c$ more efficient. If the optimization indicates that such a positive $c$ exists, the maximizer of (2) may be determined using Fisher scoring.

The second corollary follows directly from Theorem 1.

Corollary 2. Suppose that the random vector $Y$ arises from the multinomial regression model (1) and (2). Use (7) to construct the implied conditional logistic regression model, and Theorem 1 to reduce the model to one with finite maximum likelihood estimates. Then standard asymptotic methods for conditional inference on model parameters of interest, including normal theory techniques and those of [1], can be used for testing and estimation.

If either Corollary 1 or Corollary 2 indicates that finite maximum likelihood estimators do not exist, one might look for estimators in the extended real numbers $\bar{\mathbb{R}} = \{-\infty, \infty\} \cup \mathbb{R}$. In certain highly structured cases, for example, when the sample space for certain stratified rank-based tests is embedded in a canonical exponential family [13], unique maximizers in $\bar{\mathbb{R}}^K$ can be shown to exist. In general, unique estimators are known to exist in the extended real numbers only in the case when $K = 1$, since the profile likelihood obtained by setting one of the parameters to $\pm\infty$ will correspond to the likelihood of a similar model, with one parameter, and one or more observations, deleted. This reduced regression model need not be of full rank, and so the profile likelihood need not have a unique maximizer. The second example, in Section 3, involves a sample space containing points for which (conditional) maximum likelihood estimates cannot be extended unambiguously, even if allowing infinite values for some components.

[9] motivates the penalized likelihood approach for estimation in order to reduce biases of estimators.
Since the standard approach to estimation in this case allows for infinite estimators, expectations of the conventional estimators do not exist, and so bias is an inappropriate criterion for our estimator. In what follows, the median bias (that is, the difference between the median of the sampling distribution of the estimator and the true value of the parameter) is used to assess quality of estimation.

3. Results

The following data reflect the results of a randomized clinical trial testing the effectiveness of a screening procedure designed to reduce hepatitis transmission in blood transfusions [9] [14]. The clinical trial was divided into two time periods. The data may be summarized as in Table 1 [9]. Hepatitis outcome is modeled as a function of period ($S = 0$ for early, and $S = 1$ for late), screening treatment ($T = 0$ for standard, and $T = 1$ for the new method), and the interaction between period and treatment ($S \times T$). The model also contains an intercept term ($I$). We treat the response as unordered categories. A log linear model is used, implying a comparison between each of the two hepatitis categories and the third no-disease baseline category. Each of these comparisons involves parameters for $I$, $S$, $T$, and $S \times T$, for a total of 8 parameters. This model is saturated, in that there are as many parameters as there are potential table probabilities. In our case, $M = 4588$, $J_m = 3$ for all $m$, and $K = 8$. The lack of any Hepatitis C cases among the treated individuals in the early period gives rise to infinite estimates for the $T$ and $S \times T$ parameters in the comparison between Hepatitis C and healthy subjects. In this simple case, closed-form maximizers of (2) under the alternative hypothesis exist, since the alternative hypothesis may be viewed as a saturated model for a table. No closed-form maximizers of (2) exist under the null hypothesis, since this model is equivalent to the model with no three-way interactions.
The sampling distribution of the sufficient statistics is available under the hypothesis that all six $S$, $T$, and $S \times T$ coefficients are zero. Exact enumeration techniques may be used to enumerate all possible $4 \times 3$ tables, and their probabilities [15], under the assumption that treatment and time combinations are independent of disease status. There are 2,046,240 such tables. Inference on the two $S \times T$ parameters may be performed by conditioning the distribution of the associated sufficient statistics for the interaction terms on sufficient statistics associated with the $T$, $S$, and $I$ parameters. This conditioning is equivalent to conditioning on Hepatitis C and non-ABC hepatitis totals for each treatment group, and each time period, separately. After performing this conditioning, 24 tables remain, with 6 distinct values for the effect of the $S \times T$ interaction on Hepatitis C, and 4 distinct values for the effect of the $S \times T$ interaction on non-ABC hepatitis. This yields six tables for inference on the interaction effect on Hepatitis C, and four for inference on the interaction effect on non-ABC hepatitis.

Figure 1 shows the median bias for estimation of the effect on Hepatitis C, indicating a small advantage for the uncorrected estimates. Figure 2 shows coverage of penalized likelihood asymptotic confidence intervals (17), and exact intervals (9), using (10). In this case, exact intervals are readily available, and have far better coverage properties than the asymptotic intervals; in particular, note that the asymptotic intervals have zero coverage for sufficiently large absolute values of the parameter of interest. Values presented in Figure 1 are summarized in Table 2, and values presented in Figure 2 are summarized in Table 3. The ranges presented in Table 2 are heavily dependent on the range of parameter values examined in Figure 1, and are intended only for comparison between the two methods.

Table 1. Hepatitis data. Rows (group): Time 0 Treated; Time 0 Untreated; Time 1 Treated; Time 1 Untreated. Columns (hepatitis outcome, the response variable): C; Non-ABC; No disease.

Figure 1. Median bias.

Figure 2. Coverage for various two-sided confidence interval procedures.
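The exact interval (9), combined with the tilting relation (8), can be sketched in a few lines of Python once the null conditional distribution is tabulated. The sketch below assumes a single parameter of interest and a small hypothetical null distribution `Q0` (illustrative numbers, not the hepatitis data); the function names are likewise illustrative, and the level adjustment (10) is not implemented.

```python
import math

# Hypothetical null conditional distribution Q_0(t) on a small lattice;
# these numbers are illustrative only and are not taken from the paper.
Q0 = {0: 0.10, 1: 0.25, 2: 0.30, 3: 0.25, 4: 0.10}


def tilted_pmf(beta):
    """Relation (8) with I = 1: Q_beta(t) proportional to exp(t*beta) * Q_0(t)."""
    shift = max(t * beta for t in Q0)              # guards against overflow
    w = {t: math.exp(t * beta - shift) * p for t, p in Q0.items()}
    total = sum(w.values())
    return {t: v / total for t, v in w.items()}


def upper_tail(beta, t_obs):
    """P_beta[T >= t_obs], which is increasing in beta."""
    return sum(p for t, p in tilted_pmf(beta).items() if t >= t_obs)


def lower_tail(beta, t_obs):
    """P_beta[T <= t_obs], which is decreasing in beta."""
    return sum(p for t, p in tilted_pmf(beta).items() if t <= t_obs)


def solve(f, target, increasing, lo=-50.0, hi=50.0, iters=200):
    """Bisection for f(beta) = target when f is monotone in beta."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_interval(t_obs, alpha=0.05):
    """Interval (9): all beta whose two tail probabilities at t_obs are >= alpha/2."""
    lower = -math.inf if t_obs == min(Q0) else solve(
        lambda b: upper_tail(b, t_obs), alpha / 2, increasing=True)
    upper = math.inf if t_obs == max(Q0) else solve(
        lambda b: lower_tail(b, t_obs), alpha / 2, increasing=False)
    return lower, upper
```

Calling `exact_interval(0)` or `exact_interval(4)` returns an interval with one infinite endpoint, which is how the extreme points of the conditional sample space show up in this construction; interior points such as `exact_interval(2)` give finite endpoints on both sides.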

A second example concerns polling data related to British general elections [16]. The data set investigated here represents a subset of voters in one of eight geographic areas (London North, London South, Greater Manchester, Merseyside, South Yorkshire, Tyne and Wear, West Midlands, and West Yorkshire), and includes survey respondents who provided their ages and an informative response to a question measuring age at which education was finished (coded as 1 = 15 or younger, 2 = 16, 3 = 17, 4 = 18, 5 = 19 or older), and reported a party of preference, and party of intended vote in the 2005 election. Three parties (Conservative, Labor, and Liberal Democrat) were reported as answers to these last two questions. The resulting data set included 67 individuals. We model voting choice as a function of usual party preference and education, treated as an ordinal variable. In this case, $K = 8$, with covariates representing the effects of Labor and Liberal Democratic Party membership, and education, on the propensity to vote for Labor, and the same effects on the propensity to vote for the Liberal Democrats, plus intercept terms for Labor and Liberal Democrats. Membership in the Conservative Party, and choice of the Conservative candidate, are taken as baseline. We explore the effect of education on the propensity to vote for the Labor candidates.

The null distribution of this data set cannot be trivially expressed as the independence distribution for a contingency table. One might enumerate the conditional sample space for the sufficient statistics vectors of (3), and the associated conditional probabilities [17]; this calculation, however, took over 6 hours to complete when coded in FORTRAN 90 and run on a 2.6 GHz processor with a GB cache and 8 GB of memory, and is too intensive for routine use. This conditional distribution is tabulated in Table 4.
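The median bias criterion can be sketched for a one-parameter conditional family as follows. The sketch assumes a hypothetical tabulated null distribution `Q0` (illustrative numbers, not Table 4), takes the median to be the smallest support point whose cumulative probability reaches one half, and treats the conditional maximum likelihood estimate at either extreme of the support as infinite; all function names are illustrative.

```python
import math

# Hypothetical null conditional distribution Q_0(t); illustrative values only.
Q0 = {0: 0.10, 1: 0.25, 2: 0.30, 3: 0.25, 4: 0.10}


def tilted_pmf(beta):
    """Q_beta(t) proportional to exp(t*beta) * Q_0(t)."""
    shift = max(t * beta for t in Q0)              # guards against overflow
    w = {t: math.exp(t * beta - shift) * p for t, p in Q0.items()}
    total = sum(w.values())
    return {t: v / total for t, v in w.items()}


def mean(beta):
    """E_beta[T], which is increasing in beta."""
    return sum(t * p for t, p in tilted_pmf(beta).items())


def conditional_mle(t_obs):
    """Solve E_beta[T] = t_obs; infinite at the extremes of the support."""
    if t_obs == min(Q0):
        return -math.inf
    if t_obs == max(Q0):
        return math.inf
    lo, hi = -50.0, 50.0
    for _ in range(200):                           # bisection on a monotone function
        mid = 0.5 * (lo + hi)
        if mean(mid) < t_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def median_of_t(beta):
    """Smallest support point whose cumulative probability reaches 1/2."""
    pmf = tilted_pmf(beta)
    cum = 0.0
    for t in sorted(pmf):
        cum += pmf[t]
        if cum >= 0.5:
            return t
    return max(pmf)


def median_bias(beta):
    """Median of the estimator's sampling distribution minus the true beta.

    The estimate is increasing in t, so its median is the estimate at the median of T.
    """
    return conditional_mle(median_of_t(beta)) - beta
```

Because the estimate at an extreme support point is infinite, `median_bias` itself becomes infinite once the true parameter is large enough in absolute value that more than half of the conditional probability sits on that extreme point.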
Figure 3 shows the median bias for estimates [8] and the proposed method, explicitly recognizing infinite estimates associated with the extreme points in the conditional sample space. While the median bias for both estimators is poor, the corrected estimator generally performed better than the uncorrected estimator, as reported by other authors.

This manuscript is primarily concerned with producing confidence intervals. One might use the asymptotic intervals calculated using the penalized likelihood (17), or (9), with probabilities calculated exactly using Table 4 in conjunction with (8) and the nominal level adjusted using (10), or (9), with cumulative probabilities approximated (in the present case using a double saddlepoint conditional distribution function approximation of [2]). Table 5 shows the resulting confidence intervals, and Figure 4 shows the coverage of these intervals. As noted above, coverage for the asymptotic interval is zero for $\beta_1$ below the lowest possible confidence interval endpoint, and above the highest possible confidence interval endpoint; since each of these intervals is finite, and since there are only a finite number of these intervals, coverage is zero outside of a bounded interval for the penalized likelihood intervals (17). Coverage for the asymptotic saddlepoint interval is mostly quite good, except for one area in the middle. This poor performance near $\beta = 0$ can be attributed to the fact that for $t = 73$, the saddlepoint confidence interval is shifted to the left relative to the exact interval, and so the saddlepoint interval fails to cover some values of $\beta$ that the exact interval covers. See the first row in Table 5. Furthermore, this most extreme value of $t$ has a considerable amount of conditional probability attached to it; see Table 4.

In practice, Table 4, and hence the exact intervals in Table 5 and Figure 4, are not computationally feasible. The saddlepoint confidence intervals are computationally feasible, but when the entire sufficient statistic vector lies on the boundary of the convex hull of the sufficient statistic sample space, the methods of Corollary 2 are required to apply these methods.

One can also consider simultaneous inference on both education parameters. Figure 5 displays the conditional sample space for the sufficient statistic vectors for the effects of education on preference for Labor and Liberal Democratic candidates. Corollary 1 indicates that the observed sufficient statistic vector (marked in the figure) corresponds to finite estimates. Note that the point indicated by Δ is an example of one corresponding to ambiguous estimates; the parameter associated with the first component is clearly estimated as infinite. The conditional likelihood associated with this sample space, with the first component of the parameter set to this infinite value, corresponds to the sampling distribution consisting of only the point Δ; this space is degenerate, and any value for the second parameter is equally preferred. The point at the opposite corner of the parallelogram in Figure 5 exhibits similar behavior.

Values presented in Figure 4 are summarized in Table 2, and values presented in Figure 5 are summarized in Table 3. The ranges presented in Table 2 are heavily dependent on the range of parameter values examined in Figure 4, and are intended only for comparison between the two methods.

Table 2. Median bias in two estimation methods for hepatitis data. Columns: Maximized Posterior; Conditional MLE. Rows: Minimum; Maximum.

Table 3. Minimal coverage for two confidence interval methods for the hepatitis data. Columns: Asymptotic, based on posterior; Exact. Row: Minimum.

Table 4. Probabilities $Q_{0,1}(t)$ of sufficient statistic associating education with Labor party voting, conditional on sufficient statistics for other variables in model. Columns: Sufficient Statistic Value; Number of Corresponding Response Vectors; Conditional Probability.

Table 5. Two-sided confidence intervals for the effect of education on voting for Labor candidate.

Sufficient Statistic Value | (17) | Exact | Saddlepoint
73 | (.934, 0.8) | (−∞, 0.086) | (−∞, 0.242)
74 | (.508, 0.34) | (2.996, 0.052) | (8.668, 0.22)
75 | (.264, 0.204) | (.478, 0.068) | (3.420, 0.400)
76 | (.092, 0.296) | (.436, 0.30) | (2.246, 0.72)
77 | (0.958, 0.408) | (.80, 0.674) | (.624, .8)
78 | (0.848, 0.548) | (0.784, 0.86) | (.26, .72)
79 | (0.760, 0.728) | (0.666, .00) | (0.90, 2.850)
80 | (0.694, 0.990) | (0.66, 3.056) | (0.632, 8.042)
81 | (0.696, .492) | (0.406, ∞) | (0.26, ∞)

Figure 3. Median bias.

Figure 4. Coverage for various two-sided confidence interval procedures.

Figure 5. Conditional sample space for education effects.

4. Conclusion

This paper presents an algorithm for converting a multinomial regression problem that features nuisance parameters estimated at infinity to a similar problem in which all nuisance parameters have finite estimates; this conversion is such that the distribution of a sufficient statistic associated with the parameter of interest, conditional on all other sufficient statistics, remains unchanged. These conditional probabilities in the reduced model may be approximated using standard asymptotic techniques to yield confidence intervals with coverage behavior superior to those that arise from, for example, asymptotics derived from the likelihood after penalizing using Jeffreys prior.

Acknowledgements

This research was supported in part by NSF grant DMS.

References

[1] Davison, A.C. (1988) Approximate Conditional Inference in Generalized Linear Models. Journal of the Royal Statistical Society, 50.

[2] Skovgaard, I. (1987) Saddlepoint Expansions for Conditional Distributions. Journal of Applied Probability, 24.
[3] Kolassa, J.E. (1997) Infinite Parameter Estimates in Logistic Regression, with Application to Approximate Conditional Inference. Scandinavian Journal of Statistics, 24.
[4] Albert, A. and Anderson, J.A. (1984) On the Existence of Maximum Likelihood Estimates in Logistic Regression Models. Biometrika, 71.
[5] Jacobsen, M. (1989) Existence and Unicity of MLEs in Discrete Exponential Family Distributions. Scandinavian Journal of Statistics, 16.
[6] Clarkson, D.B. and Jennrich, R.I. (1991) Computing Extended Maximum Likelihood Estimates for Linear Parameter Models. Journal of the Royal Statistical Society, 53.
[7] Santner, T.J. and Duffy, D.E. (1986) A Note on A. Albert and J. A. Anderson's Conditions for the Existence of Maximum Likelihood Estimates in Logistic Regression Models. Biometrika, 73.
[8] Firth, D. (1993) Bias Reduction of Maximum Likelihood Estimates. Biometrika, 80.
[9] Bull, S.B., Mak, C. and Greenwood, C.M. (2002) A Modified Score Function Estimator for Multinomial Logistic Regression in Small Samples. Computational Statistics and Data Analysis, 39.
[10] Jeffreys, H. (1961) Theory of Probability. 3rd Edition, Clarendon Press, Oxford.
[11] Heinze, G. and Schemper, M. (2001) A Solution to the Problem of Monotone Likelihood in Cox Regression. Biometrics, 57.
[12] Barndorff-Nielsen, O.E. (1978) Information and Exponential Families in Statistical Theory. Wiley Series in Probability and Mathematical Statistics. Wiley, New York.
[13] Robinson, J. and Samonenko, I. (2012) Personal Communication.
[14] Blajchman, M., Bull, S. and Feinman, S., C.P.-T.H.P.S. Group (1995) Post-Transfusion Hepatitis: Impact of Non-A, Non-B Hepatitis Surrogate Tests. The Lancet, 345.
[15] Pagano, M. and Halvorsen, K.T. (1981) An Algorithm for Finding the Exact Significance Levels of r × c Contingency Tables. Journal of the American Statistical Association, 76.
[16] Sanders, D.J., Whiteley, P.F., Clarke, H.D., Stewart, M. and Winters, K. (2007) The British Election Study. University of Essex.
[17] Hirji, K.F. (1992) Computing Exact Distributions for Polytomous Response Data. Journal of the American Statistical Association, 87.
