Efficient and Exact Tests of the Risk Ratio in a Correlated 2x2 Table with Structural Zero


Melbourne Business School
From the SelectedWorks of Chris J. Lloyd
Summer 2007

Computational Statistics & Data Analysis 51 (2007)

Efficient and exact tests of the risk ratio in a correlated 2 × 2 table with structural zero

Chris J. Lloyd
Melbourne Business School, Carlton 3053, Australia

Received 7 July 2006; received in revised form 19 December 2006; accepted 19 December 2006. Available online 30 December 2006.

Abstract

For a correlated 2 × 2 table where the (01) cell is empty by design, the parameter of interest is typically the ratio of the probability of secondary response conditional on primary response to the probability of primary response, also known as a risk ratio. It is common to test whether or not the risk ratio equals one. One method of obtaining an exact P-value is to maximise the tail probability of the test statistic over the nuisance parameter. It is argued that better results are obtained by first replacing the nuisance parameter by its profile estimate in the calculation of its exact significance, followed by maximisation; the result is termed an E + M P-value. We consider four standard approximate test statistics with and without the common correction of adding 1/2 to each count. From a complete enumeration of the distributions of these P-values (for sample sizes 50 and 100), we recommend E + M P-values based on the uncorrected Wald statistic for testing the greater than alternative and on the corrected Wald statistic on the log-scale for testing the less than alternative. A good compromise statistic for both kinds of alternatives is the likelihood ratio statistic.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Nuisance parameters; Exact test; Correlated proportions; Discordant pairs; Maximised P-value

1. Introduction

A sample of n individuals has a binary response measured. For reasons of design, only those who give a certain response on the first occasion are measured a second time. Such designs arise, for instance, in both treating and testing for disease, see Johnson and May (1995). An often quoted example is Toyota et al.
(1999) who study the detection rates of a screening test for tuberculosis. For those who test negative on the first occasion the test is applied a second time 1-3 weeks later, whereas those who test positive on the first occasion do not need to be retested. It is suspected that application of the first test, even if negative, makes infected individuals more sensitive to subsequent tests. This booster phenomenon can be measured by the extent to which the probability of a negative response decreases from the first to the second occasion, given the first.

Another example, which we will study in some detail, was given in Agresti (1990, p. 45) (Table 1). A sample of 156 calves was tested for pneumonia during the first 60 days of life and a total of T = 93 were positive. Of these 93 calves with primary infection, $n_{11} = 30$ suffered a secondary infection in the following two weeks. There was interest in comparing the rate of primary infection, estimated to be 93/156 = 59.6%, with the rate of secondary infection, estimated to be 30/93 = 32.3%.

Table 1
Example of Agresti

                 Secondary
Primary      Yes      No     Total
Yes           30      63        93
No             -      63        63
Total         30     126       156

The ratio of these two probabilities, known as the risk ratio (RR), represents the factor by which the chance of infection changes after first infection, and is here estimated to be 30 × 156/93^2 = 0.541. An RR less than 1.0 suggests that primary infection has an immunising effect. Agresti defines a kind of χ^2 statistic for testing whether RR = 1. For this example, the value turns out to be 19.7. The signed version of this statistic is −4.44, which gives a very small approximate one-sided P-value. Certainly the evidence for a protective effect of first infection seems to be overwhelming.

Lui (1998, 2000) studied confidence intervals for the ratio and difference in response probabilities, respectively. This work was further developed by Tang and Tang (2002) for the ratio and Tang and Tang (2003) for the difference of probabilities. Lloyd and Moldovan (2007a,b) have recently applied the exact method of Buehler (1957) to confidence limits for the RR. There has been less work on the testing problem, though obviously the confidence intervals can be used to define two-sided tests.

This paper is motivated by several considerations. First, in this problem it is quite computationally feasible to calculate a P-value with exact statistical properties. This is achieved by maximising over the nuisance parameter. Within the frequentist paradigm of inference it is essential to account for the worst possible parameter values if the statistical properties are to be guaranteed. While this may seem conservative, maximisation is the most efficient method possible of achieving this guarantee.

E-mail address: c.lloyd@mbs.edu
Tests which are not maximised over the nuisance parameter are either systematically conservative or explicitly violate their stated properties, as explained in Lloyd (2005). Second, standard asymptotic tests have statistical properties that are far from ideal, even for large samples. The issue of exactness is a practical one. For instance, in the above example of Agresti, the exact P-value obtained by maximising over the nuisance parameter is 0.00361 which, while still small, corresponds to an equivalent Z-statistic of 2.91 rather than 4.44. Such behaviour is not at all uncommon. Third, such behaviour can be largely eliminated by replacing the nuisance parameter with a null estimate and then maximising, as described in Section 3. This results in a P-value that is less sensitive to the nuisance parameter and consequently the maximised versions tend to be smaller. This will be seen to translate into superior power for guaranteed size. Lastly, we look at the performance when RR > $\theta_0$ and RR < $\theta_0$ separately and discover quite different behaviour.

2. Model notation and approximate test statistics

The possible responses of an individual are {11, 10, 00}, where 00 denotes a negative response on occasion 1, in which case the second response is negative by convention. Let $n_{ij}$ be the number of individuals with response ij and $p_{ij}$ the probability of this response. The count $n_{01} = 0$ is absent by design. The probability of a positive response on the first occasion is $\phi = p_{11} + p_{10}$. The probability of a second positive response given a first positive response is $p_{11}/\phi$. The ratio of these two probabilities is $p_{11}/\phi^2$. This RR is the parameter of interest and is denoted by $\theta$. The parametrisation of the model in terms of $(\theta, \phi)$ is summarised in Table 2. In order for all these probabilities to be less than 1, there is a restriction $\phi \le \min(1, \theta^{-1})$ so the parameter space is $\Omega = \{0 \le \phi \le 1,\ \theta < \phi^{-1}\}$.
The choice of nuisance parameter has no effect on the approximate or exact test statistics to be defined below. The data $(n_{11}, n_{10}, n_{00})$ are multinomial but it is convenient to take the data to be (X, T, n), where $T = n_{11} + n_{10}$ is the number that test positive on occasion one and $X = n_{11}$ the number of these that test positive a second time. Denoting the binomial probability with parameters (n, p) by B(x; n, p), the joint distribution of (X, T) is

\[ \Pr(X = x, T = t) = \Pr(T = t)\,\Pr(X = x \mid T = t) = B(t; n, \phi)\,B(x; t, \theta\phi), \]

and so the log-likelihood function is

\[ l(\theta, \phi; x, t) = x\log\theta + (t + x)\log\phi + (n - t)\log(1 - \phi) + (t - x)\log(1 - \theta\phi). \tag{1} \]

Table 2
Parametrisation of 2 × 2 table with structural zero

                     Second response
First response   +                           −                                  Total
+                $p_{11} = \theta\phi^2$     $p_{10} = \phi(1 - \theta\phi)$    $\phi$
−                (structural zero)           $p_{00} = 1 - \phi$                $1 - \phi$
Total                                                                           1

We now describe four standard test statistics that may be used to test the null hypothesis $\theta = \theta_0$. While these statistics are asymptotically equivalent, they can behave quite differently for moderate samples.

2.1. Wald-type statistics

The maximum likelihood (ML) estimator is $\hat\theta = \hat p_{11}/\hat\phi^2$, where $\hat p_{11} = x/n$ and $\hat\phi = t/n$ are empirical estimators. The asymptotic variance of $\hat\theta = nx/t^2$ is $p_{11}(1 - p_{11})/(n\phi^4)$ as given in Lui (1998), who went on to suggest two Wald-type test statistics for testing $\theta = \theta_0$:

\[ W_1 = \frac{\hat\theta - \theta_0}{\mathrm{SE}(\hat\theta)} = \frac{xn - \theta_0 t^2}{\sqrt{nx(n - x)}}, \qquad W_2 = \frac{\log\hat\theta - \log\theta_0}{\mathrm{SE}(\log\hat\theta)} = (\log(xn) - 2\log t - \log\theta_0)\sqrt{\frac{nx}{n - x}}. \]

2.2. LR and score statistics

The likelihood ratio (LR) and score tests require the restricted ML estimator $\hat\phi_0$ of $\phi$ for fixed $\theta = \theta_0$, which is obtained by solving a quadratic equation. In the special case $\theta_0 = 1$, which is of primary interest in this paper, the solutions are $\hat\phi_0 = (x + t)/(n + t)$ and 1. The first, smaller solution corresponds to the maximum and is always within the range $(0, \min(1, \theta_0^{-1}))$, see Appendix A. The signed root LR statistic is

\[ G = \mathrm{sign}(\hat\theta - \theta_0)\sqrt{2\,(l(\hat\theta, \hat\phi) - l(\theta_0, \hat\phi_0))}. \]

The score statistic can be shown to be given by

\[ S = n(\hat p_{11} - \hat\phi\,\hat\phi_0\,\theta_0)\sqrt{\frac{1 + \hat\phi_0}{n\,\hat\phi_0^2\,(1 - \hat\phi_0)}}. \]

Agresti suggested a Pearson χ^2 statistic based on the expected values obtained by substituting $(\theta, \phi) = (\theta_0, \hat\phi_0)$ into Table 2. It can be shown that for testing the null value $\theta_0 = 1$, Agresti's statistic is identical to the score statistic.

2.3. Logical properties

These statistics are infinite or undefined when certain counts are zero.
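As a concrete illustration of the four statistics of Section 2, the following minimal Python sketch evaluates them on Agresti's calf data. It is not the author's R code. The function names are hypothetical, the symbols G and S for the LR and score statistics and the quadratic coefficients in `restricted_mle` are assumptions of this sketch (the coefficients are derived here by setting the derivative of (1) with respect to the nuisance parameter to zero), and the small `eps` shift is the tie-breaking device described in Section 2.3.

```python
import math


def restricted_mle(x, t, n, theta0=1.0):
    """Smaller root of a*phi^2 + b*phi + c = 0, the profile estimate of phi.

    The coefficients come from setting dl/dphi = 0 in the log-likelihood (1);
    they are derived for this sketch, not quoted from the paper.
    """
    a = theta0 * (n + t)
    b = -(n + x + 2.0 * theta0 * t)
    c = x + t
    disc = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    return (-b - disc) / (2.0 * a)


def loglik(theta, phi, x, t, n):
    """Log-likelihood (1), up to an additive constant."""
    return (x * math.log(theta) + (t + x) * math.log(phi)
            + (n - t) * math.log(1.0 - phi)
            + (t - x) * math.log(1.0 - theta * phi))


def statistics(x, t, n, theta0=1.0, eps=1e-6):
    """Return W1, W2, G (signed root LR) and S (score) for data (x, t, n)."""
    if x == 0 and t == 0:
        # Convention: no evidence against the null when nothing is observed.
        return {"W1": 0.0, "W2": 0.0, "G": 0.0, "S": 0.0}
    # Tie-breaking shift: (x, t, n) -> (x + eps, t + 2 eps, n + 3 eps).
    x, t, n = x + eps, t + 2.0 * eps, n + 3.0 * eps
    theta_hat, phi_hat = n * x / t**2, t / n
    phi0 = restricted_mle(x, t, n, theta0)
    w1 = (x * n - theta0 * t**2) / math.sqrt(n * x * (n - x))
    w2 = ((math.log(x * n) - 2.0 * math.log(t) - math.log(theta0))
          * math.sqrt(n * x / (n - x)))
    lr = 2.0 * (loglik(theta_hat, phi_hat, x, t, n)
                - loglik(theta0, phi0, x, t, n))
    g = math.copysign(math.sqrt(max(lr, 0.0)), theta_hat - theta0)
    s = (n * (x / n - phi_hat * phi0 * theta0)
         * math.sqrt((1.0 + phi0) / (n * phi0**2 * (1.0 - phi0))))
    return {"W1": w1, "W2": w2, "G": g, "S": s}


agresti = statistics(30, 93, 156)   # calf pneumonia data, theta0 = 1
print({name: round(value, 2) for name, value in agresti.items()})
```

For these data the score statistic agrees in magnitude with the value 4.44 quoted in the Introduction, and all four statistics are negative, pointing towards a risk ratio below one.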
Since there is no possible disadvantage in breaking ties in test statistics, all statistics are modified by replacing (x, t, n) by (x + ε, t + 2ε, n + 3ε) with ε extremely small. This will be the standard form of the statistic. It is more common practice to add 1/2 to all counts to deal with these problems and we refer to such statistics as modified. We will also investigate the four statistics with this modification, and denote them by $\tilde W_1$, etc. Finally, all test statistics are defined to equal zero when x = t = 0 since there is no evidence against the null when no successes are observed.

The parameter $\theta$ is a ratio of the conditional and marginal probability of success. For fixed t any reasonable test statistic should be increasing in x, and for fixed x decreasing in t. All statistics satisfy these properties apart from $W_2$, which can be non-increasing in x for fixed t. Wald-type test statistics commonly violate monotonicity conditions, essentially because standard error estimates break down near the boundary of the sample space. It has been previously noted by Tang and Tang (2002) that $W_1$ and $W_2$ are not monotone in x for fixed t − x.

A final point of interest is that the properties of one-sided tests of $\theta > \theta_0$ and $\theta < \theta_0$ are quite different. For instance, it is clear that the data set (x, t) = (0, n) points most strongly towards $\theta < \theta_0$; however, it is not at all clear which data set points most strongly towards $\theta > \theta_0$. We will study test properties under deviations in both directions and find quite different behaviour of the test statistics.

3. Exact tests and P-values

Tang and Tang (2002) have shown numerically that confidence intervals based on $W_1$ and $W_2$ can have poor coverage properties even for moderate sample sizes. This leads to poor performance of the implied two-sided test, at least for some null values. In this section we give a brief overview of methods for constructing so-called exact tests from a given, possibly approximate, test statistic. We have data Y and parameter $(\theta, \phi)$ and want to test the null hypothesis $\theta = \theta_0$ against either one or two-sided alternatives.
The test statistic generates an approximate P-value, denoted P(Y), from tail probabilities of an approximating null distribution. The exact significance level $\pi(y, \phi) := \Pr(P(Y) \le P(y); \theta_0, \phi)$ depends on the unknown value of $\phi$. The function $\pi$ is called the profile of the test statistic in Lloyd (2005). The classical solution is to maximise over the nuisance parameter, see Bickel and Doksum (1977, p. 168), giving the P-value $P^*(Y)$ where $P^*(y) := \sup_\phi \pi(y, \phi)$. This is sometimes called Basu's (1977) maximisation principle. The transformation from P(Y) to the new statistic $P^*(Y)$ is called the M-step in Lloyd (2005), who shows that $P^*(Y)$ satisfies the defining property of a P-value:

\[ \sup_{(\theta_0, \phi) \in \Omega} \Pr(P^*(Y) \le P^*(y); \theta_0, \phi) = P^*(y) \tag{2} \]

and is as small as possible amongst valid P-values that are non-decreasing functions of the original statistic P(Y). The maximised P-value (which we will call the M P-value) depends only on the ordering that the test statistic induces on the sample space. Further details are in Lloyd (2005). Viewed this way, maximisation is an essential step in test construction and so any inadequacies in the generated test require a rethink of the basic test statistic, not the maximisation itself.

An alternative to maximising over $\phi$ is to replace it with an estimate $\hat\phi_0$ under the null, admittedly not a new idea (see for instance Storer and Kim, 1990). This generates a new P-value $\hat P(y) = \pi(y, \hat\phi_0)$ and the transformed statistic $\hat P(Y)$ is called the E P-value. The main reason for the E-step is to obtain a P-value whose profile depends less on $\phi$. Heuristically, we expect that the estimated P-value imposes a more reasonable ordering on the sample space, because it is not based on an asymptotic approximation which may break down near the boundaries. Of course, the estimated P-value $\hat P(Y)$ is not exactly valid but can be made so by the M-step, resulting in what we call the E + M P-value. Computational issues are described in Appendix B.
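The profile, M-step and E-step can be sketched by brute-force enumeration of the sample space. The following Python is a minimal illustration under stated assumptions, not the author's implementation: it takes the null value to be one, uses the score statistic with small values counted as extreme (the "less than" alternative), and approximates the supremum by a fixed grid, which can miss narrow spikes in the profile. All function names and the example data set are hypothetical.

```python
import math


def binom_pmf(k, m, p):
    """Binomial probability B(k; m, p) for 0 < p < 1."""
    logc = math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
    return math.exp(logc + k * math.log(p) + (m - k) * math.log(1.0 - p))


def score_stat(x, t, n, eps=1e-6):
    """Score statistic for theta0 = 1 with the tie-breaking shift."""
    if x == 0 and t == 0:
        return 0.0
    x, t, n = x + eps, t + 2.0 * eps, n + 3.0 * eps
    phi0 = (x + t) / (n + t)
    return (x - t * phi0) * math.sqrt((1.0 + phi0) / (n * phi0**2 * (1.0 - phi0)))


def sample_space(n):
    return [(x, t) for t in range(n + 1) for x in range(t + 1)]


def null_probs(n, phi):
    """Pr(X = x, T = t) = B(t; n, phi) B(x; t, phi) under theta0 = 1."""
    return {(x, t): binom_pmf(t, n, phi) * binom_pmf(x, t, phi)
            for (x, t) in sample_space(n)}


def profile(stat, y, n, phi):
    """pi(y, phi): null probability of data at least as extreme as y."""
    probs = null_probs(n, phi)
    return sum(p for pt, p in probs.items() if stat[pt] <= stat[y])


def m_step(stat, y, n, grid):
    """Grid approximation to the supremum of the profile over phi."""
    return max(profile(stat, y, n, phi) for phi in grid)


def e_step(stat, y, n):
    """Profile evaluated at the null estimate (x + t) / (n + t)."""
    x, t = y
    phi0 = min(max((x + t) / (n + t), 1e-9), 1.0 - 1e-9)
    return profile(stat, y, n, phi0)


n, y = 20, (2, 10)                        # hypothetical small data set
grid = [k / 100 for k in range(1, 100)]
stat = {pt: score_stat(pt[0], pt[1], n) for pt in sample_space(n)}
m = m_step(stat, y, n, grid)              # M P-value
e = e_step(stat, y, n)                    # E P-value
e_stat = {pt: e_step(stat, pt, n) for pt in sample_space(n)}
em = m_step(e_stat, y, n, grid)           # E + M P-value
print(m, e, em)
```

Note that the E + M P-value requires the E P-value of every point in the sample space, because the E P-values replace the original statistic as the ordering used in the M-step.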
We briefly illustrate these ideas on Agresti's data. The first column of Table 3 lists the values of the four test statistics. All these generate an approximate one-sided P-value based on the normal distribution. All P-values derived from these will be converted into an equivalent normal quantile to help the reader appreciate the patterns (since the P-values in this example are all rather small). Maximising the profile function typically gives a larger, i.e. less significant, P-value. Column 2 lists the normal quantile equivalent to this M P-value. Looking first at Agresti's statistic, the M P-value is 0.00361, the equivalent normal quantile changing from 4.44 to 2.91. A graphical explanation is in the left panel of Fig. 1 where we see a spike in the profile. Much worse behaviour occurs for the standard Wald statistic $W_1$.

Table 3
Two-sided significance values for example of Agresti (1990, p. 45), expressed in terms of an equivalent Z-statistic
Columns: Generating statistic ($W_1$, $W_2$, G, S/A); Raw; M; E; E + M. [Numerical entries not recoverable.]

Fig. 1. Plot of significance profile $\pi(y, \phi)$ against $\phi$ with data y = (x, t, n) = (30, 93, 156) of Agresti (1990). In each case, the horizontal dashed line is the asymptotic P-value and the vertical dashed line is the profile ML estimate of $\phi$. Left: Agresti's score statistic. Right: Agresti's statistic after E-step.

Columns 3 and 4 give equivalent normal quantiles for the E and E + M P-values. The very minor changes from column 3 to column 4 suggest that the M-step after the E-step is practically unnecessary for this data set. For instance, the equivalent normal quantile based on Agresti's statistic is 4.31 after the E-step and barely changes after the M-step. The profile for the estimated P-value based on Agresti's statistic is in the right panel of Fig. 1 and the spike in the original profile is no longer present. In summary, across these four statistics the E + M P-value provides more evidence of significance than the M P-value, and the E + M and E P-values are almost identical.

Of course, the spikes in the profiles can be traced to the high significance these statistics attach to some data sets where x and t are small. While it is well known that asymptotic test statistics of this kind do not perform adequately when counts are small, it is sometimes forgotten that in order to assess the significance of our observed data (x, t, n) = (30, 93, 156) in a frequentist framework we must specify what is to be done in the counterfactual case of small counts. The E-step gives a better ranking of data sets near the boundary in terms of their hostility to the null. The proof of the pudding is in the flatter profile.

One cannot conclude much from a single example. For a start we have no idea if the null is actually true or not.
The next section presents results of a complete numerical investigation of the performance of the various P-values that we have described.

4. Numerical study

We have described four basic test statistics $W_1$, $W_2$, G, S and their modified versions. Each of these eight basic statistics generates four P-values, namely the approximate P-value based on the normal asymptotic approximation, the M P-value, the E P-value and the E + M P-value. Only the M and E + M P-values can be guaranteed as valid. For all possible data sets when n = 50 and 100, all 32 P-values for testing the null hypothesis $\theta = 1$ versus $\theta \ne 1$ were computed. This allows a full investigation of the performance of the implied tests, without the uncertainty of simulation.

Fig. 2. For n = 100, the left plot compares M and E + M P-values for the modified log-Wald statistic $\tilde W_2$ and the right plot for the modified Agresti/score statistic. The E + M P-values are smaller and less discrete than the M P-values.

4.1. M versus E + M P-values

We are firstly interested in comparing M P-values and E + M P-values, since only these are guaranteed to lead to valid tests. Fig. 2 presents a plot of the E + M P-values versus the M P-values, restricted to the more interesting cases where the P-values are in the range (0, 0.2). The plots are for n = 100 and for the modified statistics $\tilde W_2$ and $\tilde S$, though many similar plots are available in Lloyd (2006). Apparently, the E + M P-values tend to be systematically smaller than the M P-values. For instance, in the right plot when the M P-value is around 0.11, the E + M P-value is typically smaller. This does not by itself imply higher power, but it certainly anticipates a power advantage for the E + M P-values. Explicit power comparisons in Lloyd (2006) show that for all eight basic test statistics, E + M P-values are to be preferred to M P-values.

4.2. Power of E + M P-values from different test statistics

We next look at the powers of the tests based on the E + M P-values. Denote by $\beta_\alpha(\theta, \phi)$ the power of the size $\alpha$ test generated by rejecting the null when the P-value is less than or equal to $\alpha$, defined over the region $\Omega = \{0 \le \phi \le 1,\ \theta < \phi^{-1}\}$. For much of the parameter space the powers of all tests will be extreme and of no practical interest. Firstly, as $\phi \to 0$ we observe (x, t) = (0, 0) with probability one, all test statistics equal zero and so the power is zero. On the other hand, as $\phi$ increases and $\theta$ deviates from $\theta_0 = 1$, the power will tend to increase. This is reflected in the formula for the standard deviation of $\hat\theta$ given in Section 2.
Rather than giving contour plots of power over the entire parameter space $\Omega$, we focus on a one-dimensional subset where power is in an interesting and moderate range, by systematically moving $\theta$ closer to $\theta_0 = 1$ as $\phi$ increases. We achieve this by choosing $\theta$ such that

\[ E\bigl(2\,(l(\theta, \phi) - l(\theta_0, \phi))\bigr) = K^2, \tag{3} \]

where K is a quantile of the normal distribution, say ±1 or ±2. An expression for $l(\theta, \phi)$ was given in (1), which leads to the equation

\[ \theta\phi^2\log\theta + \phi(1 - \theta\phi)\log\left(\frac{1 - \theta\phi}{1 - \phi}\right) = \frac{K^2}{2n}. \]

If we were investigating normal data, the analogous parameter values would be $\mu = \mu_0 \pm K\sigma/\sqrt{n}$. There will typically be two solutions, one greater and one smaller than $\theta_0 = 1$, though when $\phi$ is sufficiently small there will not be a solution less than $\theta_0 = 1$. Calling the solution $\theta(\phi, K, n)$, this results in the power curves

\[ \Pi(\phi; n, K, \alpha) := \beta_\alpha(\theta(\phi, K, n), \phi). \]

The reader should again note that all powers are calculated numerically, not simulated.

Figs. 3 and 4 display power profiles for E + M P-values based on the Wald ($W_1$), log-Wald ($W_2$), LR (G) and score (S) statistics for n = 50, 100, K = ±1, ±2 and two values of $\alpha$. The plots suggest quite different behaviour for detecting alternatives $\theta > 1$ or $\theta < 1$. For less than alternatives (K = −1, −2 in top plots), $W_1$ and S perform best, with $W_1$ clearly preferred when $\phi < 0.5$ and slightly preferred when $\phi > 0.5$. For greater than alternatives (K = 1, 2 in bottom plots), $W_2$ is clearly superior while S and $W_1$ perform relatively poorly. The LR statistic performs almost as well as $W_2$. If a single compromise statistic is to be recommended, E + M P-values derived from the modified LR statistic seem to perform close to best for both kinds of alternatives.

Fig. 3. Power profiles $\Pi(\phi; 50, K, \alpha)$ of four E + M based tests. Top left: K = −1. Top right: K = −2. Bottom left: K = 1. Bottom right: K = 2.

4.3. Modification or no modification?

In order to try and further summarise the general patterns, we have calculated the averages of these power profiles to give a single power measure. While this ignores some of the parameter space, it seems more sensible to take a power average along the curve defined by (3) than to take an average over the entire parameter space. The results are in Table 4. The main new insight from this table is that E + M P-values based on the unmodified log-Wald statistic $W_2$ (i.e. without adding 1/2 to all counts) perform even better than those based on the modified log-Wald statistic $\tilde W_2$ for greater than alternatives. A plot of the corresponding power profiles indicates that the power is almost uniformly superior for the test based on unmodified log-Wald. It may also be noted that modification of the score statistic is contra-indicated for positive alternatives, though neither version of S is recommended for such alternatives in any case.

4.4. Is the M-step necessary after the E-step?

While it is true that E P-values are not guaranteed to be valid, it may be that for practical purposes the computationally intensive M-step is not required. The simplest way to investigate this is to plot the E P-values against the E + M P-values over the range (0, 0.2) of practical interest. In Fig.
5 we present plots for the unmodified log-Wald statistic and the modified Wald statistic for n = 100, these being the two best statistics arising from the analysis.

Fig. 4. Power profiles $\Pi(\phi; 100, K, 0.05)$ of four E + M based tests. Top left: K = −1. Top right: K = −2. Bottom left: K = 1. Bottom right: K = 2.

Table 4
Average power profiles of eight alternative E + M P-values
Columns: $\alpha$, n, K, followed by the eight statistics. [Numerical entries not recoverable.] Highest power is in bold font.

It seems clear from these plots that when the E P-value turns out to be small, the M-step is hardly necessary. However, for larger P-values the M-step has a non-negligible effect.
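Returning to the alternatives defined through equation (3) in Section 4.2, the curve of alternatives can be traced numerically. The Python below is a minimal sketch assuming a null value of one. It relies on the left-hand side of the displayed equation being a Kullback-Leibler divergence, hence zero at the null and monotone on either side of it, so plain bisection suffices. The function names are hypothetical and the sign of K selects the side of the null.

```python
import math


def expected_lr(theta, phi, n):
    """Expected value of twice the log-likelihood ratio, with theta0 = 1."""
    return 2.0 * n * (theta * phi**2 * math.log(theta)
                      + phi * (1.0 - theta * phi)
                      * math.log((1.0 - theta * phi) / (1.0 - phi)))


def theta_alt(phi, K, n, iterations=200):
    """Solve expected_lr(theta, phi, n) = K^2 for theta on the side of 1
    given by the sign of K. Returns None when no solution exists there."""
    target = K * K
    near = 1.0                                   # expected_lr is 0 here
    far = (1.0 - 1e-12) / phi if K > 0 else 1e-12
    if expected_lr(far, phi, n) < target:
        return None                              # target is out of reach
    for _ in range(iterations):
        mid = 0.5 * (near + far)
        if expected_lr(mid, phi, n) < target:
            near = mid
        else:
            far = mid
    return far


for K in (-2, -1, 1, 2):
    print(K, theta_alt(0.5, K, 100))
```

Evaluating the power of a test along this curve then only requires summing the probabilities of the rejection region at each pair of parameter values, which is why the paper's powers are exact rather than simulated.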

Fig. 5. For n = 100, the left plot compares E and E + M P-values for the modified Wald statistic $\tilde W_1$ and the right plot for the unmodified log-Wald statistic $W_2$. The effect of the theoretically essential M-step is to slightly increase the P-value, but the difference is negligible when the E P-value is small.

5. Discussion

Another method of comparing secondary and primary probability of success is by the simple difference rather than the ratio. Approximate confidence intervals which generate two-sided tests are given by Lui (2000) and Tang and Tang (2003). When only a proportion of individuals have a structural zero, inference has been studied by Tang and Tang (2004). The study in Lloyd (2005) has considered some other basic generating statistics, including one based on the conditional distribution of X given T. This statistic was found to be uncompetitive.

Berger and Boos (1994) suggested a quite different method of accounting for nuisance parameters which involves maximising $\pi(y, \phi)$ over a $(1 - \gamma)$ confidence region for $\phi$ and adding a penalty $\gamma$. Dependence of results on the choice of $\gamma$ can be extreme, notwithstanding the general recommendation by Berger and Sidik (2003) that $\gamma$ be small. It is quite unclear how to extend their ideas to multi-dimensional nuisance parameters. For this and other reasons, it is argued in Lloyd (2005) that the E + M approach is to be preferred.

We have not investigated other null values besides $\theta_0 = 1$ in this paper, though we have in all cases given the P-values in terms of a general null value $\theta_0$. Unlike clinical trials, where testing non-null values is of interest for establishing non-inferiority or bio-equivalence, it is not clear that such hypotheses are of interest in the contexts where structural zero matched pairs arise.

There remain unresolved computational issues, especially in the maximisation step.
A recent paper by Fang and Chen (2003) describes the use of the EM algorithm for this purpose but it is not clear how reliable this methodology is. Some computational issues are described in Appendix B. Computation times are largely dependent on the number of individuals n. For n = 100 we computed all 5151 possible E + M P-values in roughly an hour. However, computing a single P-value involves some overhead computations and when n = 100 a single P-value takes roughly 30 s, using the current unoptimised algorithm. Computation time for this naive algorithm increases with sample size at rate $n^4$ but can be reduced to $O(n^2\log n)$ by using the known monotonicity of the test statistics. Development of efficient algorithms for E + M P-values, both in the present context and in general, is an area of future research. R functions are available from the author.

Appendix A. Roots of quadratic for profile MLE

Firstly, the restricted ML estimator $\hat\phi_0$ is an estimator of $\phi$ under the restriction that $\theta_0\phi^2 = \pi_{11}$. Not surprisingly then, it can be shown that $\hat\phi_0$ is an increasing function of $\hat\phi = t/n$ and also an increasing function of $\hat\pi_{11} = x/n$ for fixed t. We will show that the smaller of the two roots is always within $[0, \min(1, \theta_0^{-1})]$ by considering the data set x = t = 0 associated with the smallest roots and x = t = n associated with the largest roots.

When x = 0 = t, we have $a = n\theta_0$, $b = -n$, $c = 0$ and so the roots are

\[ \frac{n \pm \sqrt{n^2}}{2n\theta_0}, \]

giving solutions 0 and $\theta_0^{-1}$, and it is easy to check that $\hat\phi_0 = \theta_0^{-1}$ corresponds to a local maximum. When x = t = n we have $a = 2n\theta_0$, $b = -2n(1 + \theta_0)$, $c = 2n$ and so the roots are

\[ \frac{2n(1 + \theta_0) \pm \sqrt{4n^2(1 + \theta_0)^2 - 16n^2\theta_0}}{4n\theta_0} = \frac{1 + \theta_0 \pm (1 - \theta_0)}{2\theta_0}, \]

giving solutions $\theta_0^{-1}$ or 1. It follows that across all possible data sets, the smaller of the two solutions ranges from 0, when x = t = 0, to $\min(1, \theta_0^{-1})$, when x = t = n.

Appendix B. Computational aspects of the estimation and maximisation steps

Both the E and M P-values require calculation of the profile $\pi(y, \phi) := \Pr(P(Y) \le P(y); \theta_0, \phi)$, which in turn requires defining the set $\{P(Y) \le P(y)\}$. In the worst case this would take N operations, where N is the cardinality of the sample space. In our example, Y = (X, T) and the cardinality of the sample space is N = (n + 1)(n + 2)/2, so $O(n^2)$ operations are required. However, since our test statistics are known to be non-decreasing in x for fixed t, the set can be computed by bisection in x for fixed t, which requires $O(n\log n)$ operations. The profile, which is the probability of this set, can also be efficiently computed since it is a weighted sum of binomial tail probabilities for each fixed t.

The maximisation step requires finding the supremum of $\pi(y, \phi)$. This function does not seem to have any special properties, except that it is a polynomial of degree n. It can contain quite extreme spikes. Little attention in the literature seems to have been paid to the possibility of missing such spikes. In our computations we have used a local optimiser in the R computing environment applied separately to 10 equal sized intervals of [0, 1]. All P-values in this paper can be computed within a few seconds but could be computed much faster using monotonicity. A user friendly function is available from the author.
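The bisection idea of Appendix B can be sketched as follows. This Python is an illustration under stated assumptions rather than the author's R code: it uses the score statistic with a null value of one, treats small values as extreme, and relies on that statistic being increasing in x for fixed t. The function names are hypothetical, and a naive enumeration is included only as a cross-check.

```python
import math


def binom_pmf(k, m, p):
    """Binomial probability B(k; m, p) for 0 < p < 1."""
    logc = math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
    return math.exp(logc + k * math.log(p) + (m - k) * math.log(1.0 - p))


def binom_cdf(k, m, p):
    """Pr(X <= k) for X ~ Binomial(m, p); zero when k < 0."""
    return sum(binom_pmf(j, m, p) for j in range(k + 1))


def score_stat(x, t, n, eps=1e-6):
    """Score statistic for theta0 = 1 with the tie-breaking shift."""
    if x == 0 and t == 0:
        return 0.0
    x, t, n = x + eps, t + 2.0 * eps, n + 3.0 * eps
    phi0 = (x + t) / (n + t)
    return (x - t * phi0) * math.sqrt((1.0 + phi0) / (n * phi0**2 * (1.0 - phi0)))


def boundary(t, n, s_obs):
    """Largest x in 0..t with score_stat(x, t, n) <= s_obs, or -1 if none."""
    if score_stat(0, t, n) > s_obs:
        return -1
    lo, hi = 0, t                      # invariant: stat at lo is <= s_obs
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if score_stat(mid, t, n) <= s_obs:
            lo = mid
        else:
            hi = mid - 1
    return lo


def profile_fast(y, n, phi):
    """Profile as a weighted sum of binomial tail probabilities over t."""
    s_obs = score_stat(y[0], y[1], n)
    return sum(binom_pmf(t, n, phi) * binom_cdf(boundary(t, n, s_obs), t, phi)
               for t in range(n + 1))


def profile_naive(y, n, phi):
    """Same quantity by enumerating every point of the sample space."""
    s_obs = score_stat(y[0], y[1], n)
    return sum(binom_pmf(t, n, phi) * binom_pmf(x, t, phi)
               for t in range(n + 1) for x in range(t + 1)
               if score_stat(x, t, n) <= s_obs)


print(profile_fast((3, 9), 15, 0.5), profile_naive((3, 9), 15, 0.5))
```

Because the boundary found by bisection does not depend on the nuisance parameter, it could be computed once per data set and reused at every evaluation of the profile during the maximisation step.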
References

Agresti, A., 1990. Categorical Data Analysis, first ed. Wiley, New York.
Basu, D., 1977. On the elimination of nuisance parameters. J. Amer. Statist. Assoc. 72.
Berger, R.L., Boos, D.D., 1994. P values maximised over a confidence set for the nuisance parameter. J. Amer. Statist. Assoc. 89.
Berger, R.L., Sidik, K., 2003. Exact unconditional tests for a 2 × 2 matched pairs design. Statist. Methods Med. Res. 12.
Bickel, P.J., Doksum, K.A., 1977. Mathematical Statistics. Holden-Day, Oakland.
Buehler, R.J., 1957. Confidence intervals for the product of two binomial parameters. J. Amer. Statist. Assoc. 52.
Fang, X.Z., Chen, J., 2003. EM algorithm and its application to testing hypotheses. Sci. China A 46.
Johnson, W.D., May, W.L., 1995. Combining 2 × 2 tables that contain structural zero. Statist. Med. 14.
Lloyd, C.J., 2005. E + M P-values. Austral. NZ. J. Statist., submitted for publication and available as Working Paper.
Lloyd, C.J., 2006. Efficient and exact tests in a correlated 2 × 2 table with structural zero. Working Paper.
Lloyd, C.J., Moldovan, M., 2007a. Exact confidence bounds for the risk ratio in 2 × 2 tables with structural zero. Biometrical J., to appear.
Lloyd, C.J., Moldovan, M., 2007b. Unconditional efficient upper limits for the odds ratio based on conditional likelihood. Statist. Med., to appear.
Lui, K.J., 1998. Interval estimation of the risk ratio between secondary infection, given a primary infection, and the primary infection. Biometrics 54.
Lui, K.J., 2000. Confidence intervals of the simple difference between the proportions of a primary infection and a secondary infection, given the primary infection. Biometrical J. 42.
Storer, B.E., Kim, C., 1990. Exact properties of some exact test statistics for comparing two binomial proportions. J. Amer. Statist. Assoc. 85.
Tang, N.S., Tang, M.L., 2002. Exact unconditional inference for the risk ratio in a correlated 2 × 2 table with structural zero. Biometrics 58.
Tang, N.S., Tang, M.L., 2003. Statistical inference for risk difference in an incomplete correlated 2 × 2 table. Biometrical J. 45.
Tang, M.L., Tang, N.S., 2004. Exact tests for comparing two paired proportions with incomplete data. Biometrical J. 46.
Toyota, M., Kudo, K., Kobori, O., 1999. High frequency of individuals with strong reactions to tuberculosis among clinical trainees. Japan J. Infectious Disease 52.


More information

Objective Bayesian Hypothesis Testing and Estimation for the Risk Ratio in a Correlated 2x2 Table with Structural Zero

Objective Bayesian Hypothesis Testing and Estimation for the Risk Ratio in a Correlated 2x2 Table with Structural Zero Clemson University TigerPrints All Theses Theses 8-2013 Objective Bayesian Hypothesis Testing and Estimation for the Risk Ratio in a Correlated 2x2 Table with Structural Zero Xiaohua Bai Clemson University,

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Optimal exact tests for complex alternative hypotheses on cross tabulated data

Optimal exact tests for complex alternative hypotheses on cross tabulated data Optimal exact tests for complex alternative hypotheses on cross tabulated data Daniel Yekutieli Statistics and OR Tel Aviv University CDA course 29 July 2017 Yekutieli (TAU) Optimal exact tests for complex

More information

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN Journal of Biopharmaceutical Statistics, 15: 889 901, 2005 Copyright Taylor & Francis, Inc. ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400500265561 TESTS FOR EQUIVALENCE BASED ON ODDS RATIO

More information

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box 90251 Durham, NC 27708, USA Summary: Pre-experimental Frequentist error probabilities do not summarize

More information

This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author s institution, sharing

More information

Lecture 21: October 19

Lecture 21: October 19 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use

More information

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline

More information

Definition 3.1 A statistical hypothesis is a statement about the unknown values of the parameters of the population distribution.

Definition 3.1 A statistical hypothesis is a statement about the unknown values of the parameters of the population distribution. Hypothesis Testing Definition 3.1 A statistical hypothesis is a statement about the unknown values of the parameters of the population distribution. Suppose the family of population distributions is indexed

More information

Approximate and Fiducial Confidence Intervals for the Difference Between Two Binomial Proportions

Approximate and Fiducial Confidence Intervals for the Difference Between Two Binomial Proportions Approximate and Fiducial Confidence Intervals for the Difference Between Two Binomial Proportions K. Krishnamoorthy 1 and Dan Zhang University of Louisiana at Lafayette, Lafayette, LA 70504, USA SUMMARY

More information

PROD. TYPE: COM. Simple improved condence intervals for comparing matched proportions. Alan Agresti ; and Yongyi Min UNCORRECTED PROOF

PROD. TYPE: COM. Simple improved condence intervals for comparing matched proportions. Alan Agresti ; and Yongyi Min UNCORRECTED PROOF pp: --2 (col.fig.: Nil) STATISTICS IN MEDICINE Statist. Med. 2004; 2:000 000 (DOI: 0.002/sim.8) PROD. TYPE: COM ED: Chandra PAGN: Vidya -- SCAN: Nil Simple improved condence intervals for comparing matched

More information

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30 MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)

More information

Inverse Sampling for McNemar s Test

Inverse Sampling for McNemar s Test International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test

More information

Pubh 8482: Sequential Analysis

Pubh 8482: Sequential Analysis Pubh 8482: Sequential Analysis Joseph S. Koopmeiners Division of Biostatistics University of Minnesota Week 8 P-values When reporting results, we usually report p-values in place of reporting whether or

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

A Reliable Constrained Method for Identity Link Poisson Regression

A Reliable Constrained Method for Identity Link Poisson Regression A Reliable Constrained Method for Identity Link Poisson Regression Ian Marschner Macquarie University, Sydney Australasian Region of the International Biometrics Society, Taupo, NZ, Dec 2009. 1 / 16 Identity

More information

Constructing Ensembles of Pseudo-Experiments

Constructing Ensembles of Pseudo-Experiments Constructing Ensembles of Pseudo-Experiments Luc Demortier The Rockefeller University, New York, NY 10021, USA The frequentist interpretation of measurement results requires the specification of an ensemble

More information

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Previous lecture P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing. Interaction Outline: Definition of interaction Additive versus multiplicative

More information

Lecture 01: Introduction

Lecture 01: Introduction Lecture 01: Introduction Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 01: Introduction

More information

A simulation study for comparing testing statistics in response-adaptive randomization

A simulation study for comparing testing statistics in response-adaptive randomization RESEARCH ARTICLE Open Access A simulation study for comparing testing statistics in response-adaptive randomization Xuemin Gu 1, J Jack Lee 2* Abstract Background: Response-adaptive randomizations are

More information

Describing Contingency tables

Describing Contingency tables Today s topics: Describing Contingency tables 1. Probability structure for contingency tables (distributions, sensitivity/specificity, sampling schemes). 2. Comparing two proportions (relative risk, odds

More information

TUTORIAL 8 SOLUTIONS #

TUTORIAL 8 SOLUTIONS # TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level

More information

n y π y (1 π) n y +ylogπ +(n y)log(1 π).

n y π y (1 π) n y +ylogπ +(n y)log(1 π). Tests for a binomial probability π Let Y bin(n,π). The likelihood is L(π) = n y π y (1 π) n y and the log-likelihood is L(π) = log n y +ylogπ +(n y)log(1 π). So L (π) = y π n y 1 π. 1 Solving for π gives

More information

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study TECHNICAL REPORT # 59 MAY 2013 Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study Sergey Tarima, Peng He, Tao Wang, Aniko Szabo Division of Biostatistics,

More information

Ling 289 Contingency Table Statistics

Ling 289 Contingency Table Statistics Ling 289 Contingency Table Statistics Roger Levy and Christopher Manning This is a summary of the material that we ve covered on contingency tables. Contingency tables: introduction Odds ratios Counting,

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

1 Hypothesis testing for a single mean

1 Hypothesis testing for a single mean This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Research Article Efficient Noninferiority Testing Procedures for Simultaneously Assessing Sensitivity and Specificity of Two Diagnostic Tests

Research Article Efficient Noninferiority Testing Procedures for Simultaneously Assessing Sensitivity and Specificity of Two Diagnostic Tests Computational and Mathematical Methods in Medicine Volume 2015, Article ID 128930, 7 pages http://dx.doi.org/10.1155/2015/128930 Research Article Efficient Noninferiority Testing Procedures for Simultaneously

More information

Comparison of Two Samples

Comparison of Two Samples 2 Comparison of Two Samples 2.1 Introduction Problems of comparing two samples arise frequently in medicine, sociology, agriculture, engineering, and marketing. The data may have been generated by observation

More information

A comparison of inverse transform and composition methods of data simulation from the Lindley distribution

A comparison of inverse transform and composition methods of data simulation from the Lindley distribution Communications for Statistical Applications and Methods 2016, Vol. 23, No. 6, 517 529 http://dx.doi.org/10.5351/csam.2016.23.6.517 Print ISSN 2287-7843 / Online ISSN 2383-4757 A comparison of inverse transform

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

In Defence of Score Intervals for Proportions and their Differences

In Defence of Score Intervals for Proportions and their Differences In Defence of Score Intervals for Proportions and their Differences Robert G. Newcombe a ; Markku M. Nurminen b a Department of Primary Care & Public Health, Cardiff University, Cardiff, United Kingdom

More information

Inference for Binomial Parameters

Inference for Binomial Parameters Inference for Binomial Parameters Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 58 Inference for

More information

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis Review Timothy Hanson Department of Statistics, University of South Carolina Stat 770: Categorical Data Analysis 1 / 22 Chapter 1: background Nominal, ordinal, interval data. Distributions: Poisson, binomial,

More information

Non-parametric confidence intervals for shift effects based on paired ranks

Non-parametric confidence intervals for shift effects based on paired ranks Journal of Statistical Computation and Simulation Vol. 76, No. 9, September 2006, 765 772 Non-parametric confidence intervals for shift effects based on paired ranks ULLRICH MUNZEL* Viatris GmbH & Co.

More information

ADJUSTED POWER ESTIMATES IN. Ji Zhang. Biostatistics and Research Data Systems. Merck Research Laboratories. Rahway, NJ

ADJUSTED POWER ESTIMATES IN. Ji Zhang. Biostatistics and Research Data Systems. Merck Research Laboratories. Rahway, NJ ADJUSTED POWER ESTIMATES IN MONTE CARLO EXPERIMENTS Ji Zhang Biostatistics and Research Data Systems Merck Research Laboratories Rahway, NJ 07065-0914 and Dennis D. Boos Department of Statistics, North

More information

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2 Problem.) I will break this into two parts: () Proving w (m) = p( x (m) X i = x i, X j = x j, p ij = p i p j ). In other words, the probability of a specific table in T x given the row and column counts

More information

One-sample categorical data: approximate inference

One-sample categorical data: approximate inference One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution

More information

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf

Introduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Yishay Mansour, Lior Wolf 2013-14 We know that X ~ B(n,p), but we do not know p. We get a random sample

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS 1a. Under the null hypothesis X has the binomial (100,.5) distribution with E(X) = 50 and SE(X) = 5. So P ( X 50 > 10) is (approximately) two tails

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Likelihood and Fairness in Multidimensional Item Response Theory

Likelihood and Fairness in Multidimensional Item Response Theory Likelihood and Fairness in Multidimensional Item Response Theory or What I Thought About On My Holidays Giles Hooker and Matthew Finkelman Cornell University, February 27, 2008 Item Response Theory Educational

More information

Part III. A Decision-Theoretic Approach and Bayesian testing

Part III. A Decision-Theoretic Approach and Bayesian testing Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to

More information

1 Comparing two binomials

1 Comparing two binomials BST 140.652 Review notes 1 Comparing two binomials 1. Let X Binomial(n 1,p 1 ) and ˆp 1 = X/n 1 2. Let Y Binomial(n 2,p 2 ) and ˆp 2 = Y/n 2 3. We also use the following notation: n 11 = X n 12 = n 1 X

More information

STAT 705: Analysis of Contingency Tables

STAT 705: Analysis of Contingency Tables STAT 705: Analysis of Contingency Tables Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Analysis of Contingency Tables 1 / 45 Outline of Part I: models and parameters Basic

More information

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:

More information

P Values and Nuisance Parameters

P Values and Nuisance Parameters P Values and Nuisance Parameters Luc Demortier The Rockefeller University PHYSTAT-LHC Workshop on Statistical Issues for LHC Physics CERN, Geneva, June 27 29, 2007 Definition and interpretation of p values;

More information

Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 12/15/2008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

More information

Problem 1 (20) Log-normal. f(x) Cauchy

Problem 1 (20) Log-normal. f(x) Cauchy ORF 245. Rigollet Date: 11/21/2008 Problem 1 (20) f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.0 0.2 0.4 0.6 0.8 4 2 0 2 4 Normal (with mean -1) 4 2 0 2 4 Negative-exponential x x f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.5

More information

BIOS 312: Precision of Statistical Inference

BIOS 312: Precision of Statistical Inference and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

More information

Hypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes

Hypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes Neyman-Pearson paradigm. Suppose that a researcher is interested in whether the new drug works. The process of determining whether the outcome of the experiment points to yes or no is called hypothesis

More information

Approximate Inference for the Multinomial Logit Model

Approximate Inference for the Multinomial Logit Model Approximate Inference for the Multinomial Logit Model M.Rekkas Abstract Higher order asymptotic theory is used to derive p-values that achieve superior accuracy compared to the p-values obtained from traditional

More information

STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01

STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 Nasser Sadeghkhani a.sadeghkhani@queensu.ca There are two main schools to statistical inference: 1-frequentist

More information

Mathematical Statistics

Mathematical Statistics Mathematical Statistics MAS 713 Chapter 8 Previous lecture: 1 Bayesian Inference 2 Decision theory 3 Bayesian Vs. Frequentist 4 Loss functions 5 Conjugate priors Any questions? Mathematical Statistics

More information

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q)

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q) Supplementary information S7 Testing for association at imputed SPs puted SPs Score tests A Score Test needs calculations of the observed data score and information matrix only under the null hypothesis,

More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

Lecture 21. Hypothesis Testing II

Lecture 21. Hypothesis Testing II Lecture 21. Hypothesis Testing II December 7, 2011 In the previous lecture, we dened a few key concepts of hypothesis testing and introduced the framework for parametric hypothesis testing. In the parametric

More information

Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 1/15/008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

More information

Ch. 5 Hypothesis Testing

Ch. 5 Hypothesis Testing Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,

More information

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants 18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

More information

ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS

ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS Libraries 1997-9th Annual Conference Proceedings ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS Eleanor F. Allan Follow this and additional works at: http://newprairiepress.org/agstatconference

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

Statistics. Statistics

Statistics. Statistics The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,

More information

n =10,220 observations. Smaller samples analyzed here to illustrate sample size effect.

n =10,220 observations. Smaller samples analyzed here to illustrate sample size effect. Chapter 7 Parametric Likelihood Fitting Concepts: Chapter 7 Parametric Likelihood Fitting Concepts: Objectives Show how to compute a likelihood for a parametric model using discrete data. Show how to compute

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Computing Highly Accurate or Exact P-values using Importance Sampling (revised)

Computing Highly Accurate or Exact P-values using Importance Sampling (revised) Melbourne Business School From the SelectedWorks of Chris J. Lloyd February, 2010 Computing Highly Accurate or Exact P-values using Importance Sampling (revised) Chris Lloyd, Melbourne Business School

More information

Chapter 7. Hypothesis Testing

Chapter 7. Hypothesis Testing Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function

More information

Hypothesis Testing. A rule for making the required choice can be described in two ways: called the rejection or critical region of the test.

Hypothesis Testing. A rule for making the required choice can be described in two ways: called the rejection or critical region of the test. Hypothesis Testing Hypothesis testing is a statistical problem where you must choose, on the basis of data X, between two alternatives. We formalize this as the problem of choosing between two hypotheses:

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Categorical data analysis Chapter 5

Categorical data analysis Chapter 5 Categorical data analysis Chapter 5 Interpreting parameters in logistic regression The sign of β determines whether π(x) is increasing or decreasing as x increases. The rate of climb or descent increases

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Exact one-sided confidence limits for the difference between two correlated proportions

Exact one-sided confidence limits for the difference between two correlated proportions Melbourne Business School From the SelectedWorks of Chris J. Lloyd Winter December, 2007 Exact one-sided confidence limits for the difference beteen to correlated proportions Chris Lloyd Max V Moldovan

More information

Lecture 2: Statistical Decision Theory (Part I)
