2.1.3 The Testing Problem and Neave's Step Method


we can guarantee (1) that the (unknown) true parameter vector θ_t ∈ Θ is an interior point of Θ, and (2) that ρ_{θ_t}(R) > 0 for any R ∈ 2^Q. These are two of Birch's regularity conditions that were critical under the entire BLIM space.

Remarks

1. In the monograph by Doignon and Falmagne (1999), the general BLIM, described by the entire parameter space, is tested for goodness of fit. The reader should note, given the preceding remarks, that under this null model the asymptotic chi-square approximation need not hold in general.

2. Presumably,[6] even the restricted parameter space Θ does not fulfill all of Birch's regularity conditions. More concretely, in general, restrictions must be imposed on the BLIM to attain identifiability (to be discussed later).

3. Despite these caveats, we will use the fictitious example, with the corresponding concrete results, discussed in Doignon and Falmagne (1999) for the general BLIM (i.e., for the entire parameter space); but we will speak of the restricted parameter space Θ. Strictly speaking, then, the results listed in the following need not be the correct ones if the space Θ were actually fitted to the data. However, the example is fictitious, and the exposition is only meant to illustrate the principal concepts of the analysis.

2.1.3 The Testing Problem and Neave's Step Method

How are the goodness-of-fit statistics X² and G² used to test the fit of the BLIM described by Θ? More precisely, how can we test the null hypothesis

    H₀: For an appropriate θ ∈ Θ, (ρ_t(R))_{R ∈ 2^Q} = (ρ_θ(R))_{R ∈ 2^Q}

versus the (complementary) alternative hypothesis

    H₁: There is no θ ∈ Θ for which (ρ_t(R))_{R ∈ 2^Q} = (ρ_θ(R))_{R ∈ 2^Q}?

Following the general procedure of classical significance testing, we proceed in several steps:[7]

1. Specify justifiable distributional assumptions underlying the empirical phenomenon from which the data are observed.

2. Based on 1, formulate the empirical problem of interest in terms of H₀ and H₁.
3. Based on 1 and 2, choose a test statistic T(x) (x the data) and, viewing it as a random variable T(X), derive its probability distribution under H₀.

4. Based on 3, choose a critical region (or rejection region) to the significance level α ∈ ]0, 1[ (e.g., α = 0.05). A critical region C should consist of values of T which point only weakly to H₀ but rather most strongly to H₁.

[6] This should be investigated!
[7] This is the step method proposed by Neave (1976).

Ideally, we should have[8]

    P_{H₀}(T ∈ C) ≤ α.

5. Based on 4, propose the decision rule: if T(x) ∈ C, we reject H₀ in favour of H₁; if T(x) ∉ C, we do not reject H₀ in favour of H₁ (which, in general, should not be confused with accepting H₀).

2.1.4 Step 1: Data and Multinomial Distribution

The data x are constituted by the observed absolute counts of the response patterns R ∈ 2^Q; in other words, x = (N(R))_{R ∈ 2^Q} (see Table 1). We assume that different subjects give their response patterns independently of each other. The (unknown) true probability of occurrence, ρ_t(R), of any response pattern R ∈ 2^Q is assumed to stay constant across the N subjects and to be strictly larger than 0. Then, in the series of N examinees responding to the items, the probability of observing N(R) subjects giving the response pattern R ∈ 2^Q (R varying over the entire 2^Q) is given by (m := |Q|, the number of items)

    N! / (∏_{R ∈ 2^Q} N(R)!) · ∏_{R ∈ 2^Q} ρ_t(R)^{N(R)}.

If we introduce random variables X_R (R ∈ 2^Q) that represent the respective observed absolute cell counts N(R), then we can consider x := (N(R))_{R ∈ 2^Q} as a realization of the random vector X := (X_R)_{R ∈ 2^Q}. In particular, we can recap the previous probability in the notation

    P(X = x) := P(X_∅ = N(∅), ..., X_Q = N(Q)) = N! / (∏_{R ∈ 2^Q} N(R)!) · ∏_{R ∈ 2^Q} ρ_t(R)^{N(R)}.

Here, ρ_t(R) > 0 for any R ∈ 2^Q, and Σ_{R ∈ 2^Q} ρ_t(R) = 1. Further, N(R) ∈ ℕ ∪ {0} with 0 ≤ N(R) ≤ N for any R ∈ 2^Q, and Σ_{R ∈ 2^Q} N(R) = N. In other words, X = (X_R)_{R ∈ 2^Q} is a multinomial random vector.[9]

Definition 2.1. Let t, n ∈ ℕ, and let p = (p_1, p_2, ..., p_t) ∈ ℝ^t with p_i > 0 (1 ≤ i ≤ t) and Σ_{i=1}^t p_i = 1. Then, a random vector X = (X_1, X_2, ..., X_t) is called multinomial with parameters n and probability vector p if

    P(X = x) := P(X_1 = x_1, X_2 = x_2, ..., X_t = x_t) = n! / (∏_{i=1}^t x_i!) · ∏_{i=1}^t p_i^{x_i}

[8] For a test based on an asymptotic H₀-distribution, this condition may only be satisfied approximately, for (finite) large sample sizes.
[9] Exercise 2.2 points out that the concept of a multinomial distribution generalizes the known concept of a binomial distribution.

for any x = (x_1, x_2, ..., x_t) with x_i ∈ ℕ ∪ {0}, 0 ≤ x_i ≤ n, and Σ_{i=1}^t x_i = n. In this case, we briefly write X ~ M_t(n, p).

Let us summarize. The distributional assumption we make is that the cell counts over the collection of all response patterns follow a multinomial probability distribution. More precisely,

    X = (X_R)_{R ∈ 2^Q} ~ M_{2^m}(N, p = (ρ_t(R))_{R ∈ 2^Q}).

2.1.5 Step 2: Null and Alternative Hypotheses

The empirical problem we aim at is to test the fit of the BLIM described by Θ. In other words, the empirical problem of interest is whether the (unknown) true cell probabilities ρ_t(R) (R ∈ 2^Q) of the manifest multinomial distribution of X = (X_R)_{R ∈ 2^Q} (Step 1) can be expressed, and so explained, by this BLIM as an underlying latent psychological model. This concern is already formulated in terms of H₀ and H₁:

    H₀: For an appropriate θ ∈ Θ, (ρ_t(R))_{R ∈ 2^Q} = (ρ_θ(R))_{R ∈ 2^Q};
    H₁: There is no θ ∈ Θ for which (ρ_t(R))_{R ∈ 2^Q} = (ρ_θ(R))_{R ∈ 2^Q}.

2.1.6 Step 3: Test Statistics X² and G²

Based on Steps 1 and 2, let us first motivate the choice of the test statistic T(x), depending on the data x = (N(R))_{R ∈ 2^Q}, and then, viewing T as a random variable, derive its (asymptotic) probability distribution assuming H₀ holds.

Limiting χ²-Distribution of X² and G²

As already indicated, we want to judge the appropriateness of a hypothesized model by making predictions using the model. For instance, Pearson's X² and the log-likelihood ratio statistic G² are based on the discrepancies between the expected frequencies Nρ_θ(R), derived under a specific model θ ∈ Θ, and the observed absolute frequencies N(R) (R ∈ 2^Q). However, in the case of the null model H₀ considered here, no specific model θ ∈ Θ (and thus no (ρ_θ(R))_{R ∈ 2^Q}) is specified. Assuming H₀ holds, the only thing we know is that the true specific model describing the response data is an element of the parameter space Θ; its value is not known. In particular, we cannot make predictions based on this specific model.
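For a fully specified θ, by contrast, such predictions are straightforward. The following minimal Python sketch shows how a concrete BLIM parameter vector yields the pattern probabilities ρ_θ(R); the two-item knowledge structure and all numerical values here are invented purely for illustration (they are not the standard example).

```python
from itertools import chain, combinations

# Hypothetical toy setting: items Q = {a, b}, knowledge structure
# H = {∅, {a}, {a, b}} with state probabilities p(K), and per-item
# careless-error (beta) and lucky-guess (eta) rates.
Q = ["a", "b"]
H = [frozenset(), frozenset("a"), frozenset("ab")]
p = {H[0]: 0.3, H[1]: 0.3, H[2]: 0.4}
beta = {"a": 0.1, "b": 0.2}   # careless error probabilities
eta = {"a": 0.1, "b": 0.05}   # lucky guess probabilities

def rho(R):
    """BLIM probability rho_theta(R) of observing response pattern R."""
    total = 0.0
    for K in H:
        prod = p[K]
        for q in Q:
            if q in K and q in R:
                prod *= 1 - beta[q]      # mastered and solved
            elif q in K and q not in R:
                prod *= beta[q]          # mastered, careless error
            elif q not in K and q in R:
                prod *= eta[q]           # not mastered, lucky guess
            else:
                prod *= 1 - eta[q]       # not mastered, not solved
        total += prod
    return total

# All response patterns R in 2^Q, and their model probabilities.
patterns = [frozenset(s) for s in chain.from_iterable(
    combinations(Q, r) for r in range(len(Q) + 1))]
probs = {R: rho(R) for R in patterns}
print({tuple(sorted(R)): round(probs[R], 4) for R in patterns})
```

The ρ_θ(R) computed this way sum to 1 over all R ∈ 2^Q, as a probability vector must.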
This complication is resolved as follows. Assume H₀ is true, i.e., the BLIM based on Θ holds, so that we are left with selecting a most plausible vector in Θ. It is then reasonable to approximate, i.e., estimate, the true parameter vector θ_t ∈ Θ by choosing a value θ̂ ∈ Θ which is most consistent with the observed data, in the sense that it optimizes a certain measure of discrepancy quantifying the consistency between the observed data and the predictions

made under a specific model. Concentrating on X² and G² (for θ ∈ Θ, data x = (N(R))_{R ∈ 2^Q}, and sample size N),

    X²(θ; x, N) := Σ_{R ∈ 2^Q} (N(R) − Nρ_θ(R))² / (Nρ_θ(R)),

    G²(θ; x, N) := 2 Σ_{R ∈ 2^Q} N(R) ln(N(R) / (Nρ_θ(R))),

we could choose estimates θ̂_X² resp. θ̂_G² optimizing X² resp. G², in the sense that they satisfy the optimization problems

    X²(θ̂_X²; x, N) = inf_{θ ∈ Θ} X²(θ; x, N),
    G²(θ̂_G²; x, N) = inf_{θ ∈ Θ} G²(θ; x, N).

Assuming H₀ holds, under Birch's regularity conditions (Birch, 1964) sketched in Section 2.2, these optimization problems have a (generalized) solution θ̂_X² resp. θ̂_G² (in a generalized parameter space containing the initial space Θ).[10] These estimates θ̂_X²(x; N) and θ̂_G²(x; N) for the (unknown) true parameter vector θ_t ∈ Θ depend on the data x. They become random variables θ̂_X²(X; N) and θ̂_G²(X; N) when the data x are replaced with the random vector X (cp. Step 1); these are called estimators for the (unknown) true parameter vector θ_t ∈ Θ. If we suppose that θ̂_X²(x; N) and θ̂_G²(x; N) belong to the initial parameter space Θ,[11] we can calculate the expected frequencies Nρ_{θ̂_X²}(R) resp. Nρ_{θ̂_G²}(R) (R ∈ 2^Q), and the discrepancies X²(θ̂_X²; x, N) resp. G²(θ̂_G²; x, N). (Intuitive motivation: it seems plausible that we may expect small values for these statistics in case the null model really holds; larger values rather speak against the null model.) Indeed, we choose these test statistics in Step 3 of Neave's step method:

    T_X²(x; N) := X²(θ̂_X²; x, N),
    T_G²(x; N) := G²(θ̂_G²; x, N).

Next, in Step 3, we require the distributions of these test statistics under the null model H₀ to be known. This is achieved by the following main result.[12]

[10] This is the theory of generalized minimum distance estimation for multinomial models, e.g., discussed in Bishop et al. (1975) for maximum likelihood estimation, and in Read & Cressie (1988) for the general power-divergence family (see Section 2.2).
The generalized parameter space is the closure of the initial parameter space, plus possibly a point at infinity, depending on whether the initial parameter space is bounded or not.

[11] In large samples, under the conditions described above, the probability that θ̂_X²(x; N) resp. θ̂_G²(x; N) does not belong to Θ goes to zero as N → ∞.

[12] This result will be generalized to the power-divergence family of Read-Cressie statistics, of which the two statistics X² and G² are special members.
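Given any candidate θ, and hence model probabilities ρ_θ(R), the two discrepancy measures defined above are elementary to evaluate. A minimal Python sketch, with invented counts and cell probabilities (not the data of Table 1):

```python
import numpy as np

# Hypothetical observed counts N(R) over the cells (response patterns),
# and model probabilities rho_theta(R) for some fixed parameter vector theta.
counts = np.array([30, 50, 15, 5])              # N(R); here N = 100
rho_theta = np.array([0.25, 0.55, 0.15, 0.05])  # rho_theta(R), sums to 1
N = counts.sum()
expected = N * rho_theta                        # expected frequencies N * rho_theta(R)

# Pearson's X^2 and the log-likelihood ratio statistic G^2:
X2 = np.sum((counts - expected) ** 2 / expected)
G2 = 2 * np.sum(counts * np.log(counts / expected))

print(round(X2, 3), round(G2, 3))
```

Under H₀ these values would then be minimized over θ ∈ Θ to obtain the estimates θ̂_X² and θ̂_G². (Note that the two statistics take similar but not identical values, in line with the asymptotic equivalence discussed below Theorem 2.2.)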

Theorem 2.2 (Main Result). If H₀ holds, s < 2^|Q| − 1 (s ∈ ℕ, the number of (unknown) independent parameters of the model), and Birch's regularity conditions are satisfied, then both statistics T_X²(X; N) and T_G²(X; N) have a limiting chi-square distribution with degrees of freedom df = (2^m − 1) − s. If we denote the chi-square distribution with d := (2^m − 1) − s degrees of freedom by χ²_d, this result can be recapped as

    T_X²(X; N), T_G²(X; N) →_Distr χ²_d  as N → ∞,

where the symbol →_Distr stands for convergence in distribution.[13]

Proof. See Bishop et al. (1975), or Read & Cressie (1988).

Asymptotic Equivalence, Critical Values, and Finite Sample Approximations

Remarks. Throughout the following remarks, let the prerequisites be as in Theorem 2.2.

1. Since X² and G² have the same asymptotic distribution χ²_d, they are called asymptotically equivalent. Authors sometimes state that the two statistics X²(θ̂_X²; X, N) and G²(θ̂_G²; X, N) yield nearly the same value if N is large and the observed table of cell counts is not sparse. This is suggested by

    X²(θ̂_X²; X, N) = G²(θ̂_G²; X, N) + o_p(1),

with o_p(1) representing a stochastic sequence, i.e., a sequence of random variables, which converges in probability to the constant 0 as N → ∞.[14] Informally, if N is sufficiently large, then with high probability the two statistics will yield nearly the same value. Even more, it holds that

    X²(y; X, N) = G²(y′; X, N) + o_p(1)

for any combination of y, y′ ∈ {θ̂_X², θ̂_G²}. In other words, if N is sufficiently large, then with high probability any of these values will be nearly equal. This can be reviewed in Read & Cressie (1988). In particular, using Theorem 2.2,

    X²(y; X, N), G²(y′; X, N) →_Distr χ²_d  as N → ∞,

for any combination of y, y′ ∈ {θ̂_X², θ̂_G²}. In other words, we can use each of the estimators θ̂_X²(X; N) and θ̂_G²(X; N) in either of the statistics X²(·; X, N) and G²(·; X, N) and obtain asymptotically the same limiting χ²_d-distribution.
[13] In Exercise 2.3, the concept of convergence in distribution is reviewed.
[14] In Exercise 2.3, you are asked to review the concept of convergence in probability.

Thus,

e.g., if only the maximum likelihood estimate θ̂_G²(x; N) is available,[15] we can use it in the computation of Pearson's X²(·; x, N), and then test the null model H₀ using this statistic. All these remarks hold because, under the conditions of Theorem 2.2 (which, recall, are assumed throughout this Remarks paragraph), the estimators θ̂_X²(X; N) and θ̂_G²(X; N) are best asymptotically normal (BAN; see Read & Cressie, 1988). In particular,

    θ̂_X²(X; N) = θ̂_G²(X; N) + o_p(1).

In other words, if N is sufficiently large, then with high probability both estimators will yield nearly the same value.

2. Theorem 2.2 implies, for any c ∈ ℝ,

    lim_{N→∞} P_{H₀}(T_X²(X; N) > c) = P(χ²_d > c) = lim_{N→∞} P_{H₀}(T_G²(X; N) > c).

This is because (without loss of generality, consider only T_X²; c ∈ ℝ)

    lim_{N→∞} P_{H₀}(T_X²(X; N) > c)
        = lim_{N→∞} {1 − P_{H₀}(T_X²(X; N) ≤ c)}
    (i) = lim_{N→∞} {1 − F_{T_X²(X;N)}(c)}
        = 1 − lim_{N→∞} F_{T_X²(X;N)}(c)
        = 1 − F_{χ²_d}(c)          [Theorem 2.2]
        = 1 − P(χ²_d ≤ c)
        = P(χ²_d > c).

Ad (i): F_{T_X²(X;N)} : ℝ → [0, 1], x ↦ F_{T_X²(X;N)}(x) := P(T_X²(X; N) ≤ x), is the cumulative distribution function of the random variable T_X²(X; N) (cp. Exercise 2.3). In particular, if χ²_{d,α} denotes the upper critical value to the significance level α ∈ ]0, 1[, in other words, P(χ²_d > χ²_{d,α}) = α, then, for c := χ²_{d,α},

    lim_{N→∞} P_{H₀}(T_X²(X; N) > χ²_{d,α}) = lim_{N→∞} P_{H₀}(T_G²(X; N) > χ²_{d,α}) = P(χ²_d > χ²_{d,α}) = α.

Similar expressions hold for any combination of the estimators θ̂_X²(X; N) resp. θ̂_G²(X; N) and the test statistics X²(·; X, N) resp. G²(·; X, N) (cp. 1).

[15] As we will demonstrate later, maximizing the likelihood function L(θ; x, N) is equivalent to minimizing the log-likelihood ratio statistic G²(θ; x, N) (θ ∈ Θ; fixed data x).

3. Based on 2, for N sufficiently large,

    P_{H₀}(T_X²(X; N) > χ²_{d,α}) ≈ α,
    P_{H₀}(T_G²(X; N) > χ²_{d,α}) ≈ α.

Similar expressions hold for any combination of the estimators θ̂_X²(X; N) resp. θ̂_G²(X; N) and the test statistics X²(·; X, N) resp. G²(·; X, N) (cp. 1).

4. From a practical point of view, the question "How large is sufficiently large?" is crucial for the applicability of these asymptotic results, for in practice only finite sample sizes are available. Let us reflect a bit on this issue. For a finite-sample (i.e., N < ∞) approximation to be acceptable, N should be sufficiently large. Since the number of cells of the multinomial, i.e., the number of response patterns, 2^|Q| = 2^m, is fixed, this suggests that the expected frequencies Nρ_{θ̂_X²}(R) resp. Nρ_{θ̂_G²}(R) (R ∈ 2^Q) should be rather large. In other words, informally, large expected frequencies for the response patterns seem to be a plausible necessity for a good finite-sample approximation. Indeed, it is known that the finite-sample χ²-approximation for X² relies on the expected frequencies in each cell of the multinomial being large. Cochran (1952, 1954) provides a complete bibliography of the early discussions regarding this point. In the early literature, there were a variety of recommendations, rules of thumb, regarding the minimum expected cell frequency required for the χ²-approximation to be reasonably accurate: values from 1 to 20 were suggested, generally based on individual experiences. Good et al. (1970) provide an overview of the historical recommendations. With the advancement of computers, computer-intensive statistical methods have become available, and exact and simulation studies have contributed to a deeper understanding of the χ²-approximation for X² and G² in small samples. The interested reader is referred to Read & Cressie (1988) regarding these issues.
They provide a very nice chapter on historical perspectives concerning the two classical goodness-of-fit measures X² and G². Finally, let us review two rules of thumb for judging the appropriateness of the χ²-approximation in finite samples. A conservative rule of thumb is this: the χ²-approximation is accepted if, for any R ∈ 2^Q, the expected frequency Nρ_{θ̂_X²}(R) (resp. Nρ_{θ̂_G²}(R)) of response pattern R is greater than 5, i.e.,

    Nρ_{θ̂_X²}(R) (resp. Nρ_{θ̂_G²}(R)) > 5.

If the number of response patterns (i.e., the number of cells of the multinomial) is large, the observed data table (see Table 1) tends to be rather sparse, i.e., to have small observed absolute cell frequencies. In this case, the previous criterion may not be satisfied. A less demanding criterion is then (see Fienberg, 1980) that, for any R ∈ 2^Q, the expected frequency Nρ_{θ̂_X²}(R) (resp. Nρ_{θ̂_G²}(R)) of response

pattern R is greater than or equal to 1, i.e.,

    Nρ_{θ̂_X²}(R) (resp. Nρ_{θ̂_G²}(R)) ≥ 1.

It should be noted that a general conclusion from a number of comparative studies involving X² and G² is that, under H₀, in finite samples, the limiting χ²-distribution approximates the exact (finite-sample) distribution of X² more closely than the exact distribution of G² (see Read & Cressie, 1988). A last word: what to do if the finite-sample χ²-approximation to X² and G² is considered poor? How might the approximation be improved? There are a variety of suggestions available regarding this point. For instance, Doignon & Falmagne (1999) suggest grouping cells with low expected frequencies if the less demanding criterion (minimum expected cell frequency ≥ 1) fails. Other suggestions range from second-order terms, corrected χ²-approximations, moment corrections, adding positive constants to cells, and matching tails, to log-normal approximations, et cetera. Read & Cressie (1988) provide an extensive list of references regarding this important point.

5. The criterion "minimum expected cell frequency ≥ 1" (see 4) will be used in all the analyses that follow.

Parameter Estimates and Standard Example

We return to the standard example, enriched with the fictitious data in Table 1. In this example, the number s of (unknown) independent model parameters is (m := |Q|, the number of items)

    s (i)= (|H| − 1) + 2m = (9 − 1) + 2·5 = 18.

Ad (i): this is the general formula for the number of (unknown) independent model parameters of any general BLIM without further restrictions. Because of Σ_{K ∈ H} p(K) = 1, there are only |H| − 1 independent state probabilities; furthermore, for any item q ∈ Q, there are two item parameters β_q, η_q. In particular,

    s = 18 < 31 = 2⁵ − 1 = 2^m − 1.

Under the null model H₀, i.e., assuming the BLIM based on the parameter space Θ is correct, (we assume that) Birch's regularity conditions are satisfied (cp. Remark 2 preceding Section 2.1.3).
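The parameter and degrees-of-freedom count just derived is easy to verify in a few lines (with |H| = 9 states and m = 5 items, as in the standard example):

```python
# Independent parameters and degrees of freedom for the standard example.
H_size, m = 9, 5                 # |H| knowledge states, m items
s = (H_size - 1) + 2 * m         # (|H| - 1) state prob. + (beta_q, eta_q) per item
cells = 2 ** m                   # number of response patterns (multinomial cells)
df = (cells - 1) - s             # degrees of freedom of the limiting chi-square
print(s, cells - 1, df)          # 18 31 13
```

The condition s < 2^m − 1 of Theorem 2.2 is satisfied (18 < 31), and d = 13 is the value used in Step 4 below.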
The generalized parameter space containing the initial parameter space Θ is given by the closure of Θ in ℝ^{|H|+2m} = ℝ^19 (|H| = 9, m = 5), i.e., by [0, 1]^19. The (generalized) minimum chi-square estimate θ̂_X²(x; N = 1,000), obtained by optimizing Pearson's X²(θ; x, N = 1,000) (in the sense of the optimization problems above) for the data x = (N(R))_{R ∈ 2^Q} in Table 1, is given in Table 2.[16]

[16] The abbreviation "prob." stands for probabilities; "CE" resp. "LG" stand for careless error resp. lucky guess.

Table 2
(Generalized) minimum chi-square estimate θ̂_X²(x; N = 1,000)

    CE prob.      LG prob.      State prob.         State prob.
    β_a = 0.17    η_a = 0.00    p(∅) = 0.05         p({a, b, c}) = 0.04
    β_b = 0.17    η_b = 0.09    p({a}) = 0.11       p({a, b, d}) = 0.19
    β_c = 0.20    η_c = 0.00    p({b}) = 0.08       p({a, b, c, d}) = 0.19
    β_d = 0.46    η_d = 0.00    p({a, b}) = 0.00    p({a, b, c, e}) = 0.03
    β_e = 0.20    η_e = 0.03    p(Q) = 0.31

    X²(θ̂_X²; x, N = 1,000) [:= inf_{θ ∈ Θ} X²(θ; x, N = 1,000)] = 14.7

Remarks

1. Note that θ̂_X²(x; N = 1,000) lies in the generalized parameter space [0, 1]^19, but not in the initial parameter space Θ; for instance, p({a, b}) = η_a = 0 (i.e., boundary values).

2. From a computational point of view, the optimization problems above are by no means trivial. In general, solutions to these problems cannot be obtained by analytical methods in closed form; numerical optimization algorithms are used instead, and the solutions obtained are approximate. According to Doignon & Falmagne (1999), the (generalized) minimum chi-square estimate reported above is obtained using a conjugate gradient search algorithm called PRAXIS,[17] by Brent (1973), a modification of Powell's (1964) direction-set method. This algorithm allows for the optimization of a function of several variables without calculating derivatives. It is implemented as a C function by Gegenfurtner (1992); the C code can be downloaded from the Web site praxis/.html. Other numerical methods, mainly for the computation of maximum likelihood estimates (in particular, the minimum log-likelihood ratio G² estimate), are given by iterative procedures, such as iterative proportional fitting (Goodman, 1974; Bishop et al., 1975; Fienberg, 1980), Fisher's method of scoring (Rao, 1965), the general Expectation-Maximization (EM) algorithm (Dempster et al., 1977; McLachlan & Krishnan, 1997), and methods of the Newton-Raphson type (Haberman, 1974, 1978, 1979). Denteneer & Verbeek (1986) provide a series of efficiency comparisons for various implementations of these algorithms.
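As a small illustration of such derivative-free minimum chi-square fitting (this is scipy's implementation of Powell's direction-set method, not Brent's PRAXIS, and the model is a toy one-parameter trinomial, not the BLIM; all data are invented):

```python
import numpy as np
from scipy.optimize import minimize

# Toy one-parameter multinomial model: cell probabilities
# ((1-t)^2, 2t(1-t), t^2) for t in (0, 1) (a Hardy-Weinberg-type trinomial).
counts = np.array([380, 470, 150])   # invented observed counts, N = 1000
N = counts.sum()

def pearson_X2(theta):
    """Pearson's X^2 as a function of the model parameter."""
    t = theta[0]
    rho = np.array([(1 - t) ** 2, 2 * t * (1 - t), t ** 2])
    expected = N * rho
    return np.sum((counts - expected) ** 2 / expected)

# Powell's method is derivative-free, like the PRAXIS routine mentioned above.
res = minimize(pearson_X2, x0=[0.5], method="Powell",
               bounds=[(1e-6, 1 - 1e-6)])
theta_hat = res.x[0]
print(round(theta_hat, 4), round(res.fun, 4))
```

The minimum chi-square estimate lands close to the maximum likelihood value (470 + 2·150)/2000 ≈ 0.385, illustrating the asymptotic equivalence of the estimators noted earlier.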
The (generalized) minimum G² estimate, i.e., the (generalized) maximum likelihood estimate θ̂_G²(x; N = 1,000), obtained by optimizing G²(θ; x, N = 1,000) for the data x = (N(R))_{R ∈ 2^Q} in Table 1, is given in Table 3.[18]

[17] PRAXIS stands for PRincipal AXIS (Brent, 1973).
[18] This is done using Brent's (1973) PRAXIS method (cp. Remark 2 above). "ML" stands for maximum likelihood; other abbreviations are defined as in Table 2.

Table 3
(Generalized) minimum G²/ML estimate θ̂_G²(x; N = 1,000)

    CE prob.      LG prob.      State prob.         State prob.
    β_a = 0.16    η_a = 0.04    p(∅) = 0.05         p({a, b, c}) = 0.08
    β_b = 0.16    η_b = 0.10    p({a}) = 0.10       p({a, b, d}) = 0.15
    β_c = 0.19    η_c = 0.00    p({b}) = 0.08       p({a, b, c, d}) = 0.16
    β_d = 0.29    η_d = 0.00    p({a, b}) = 0.04    p({a, b, c, e}) = 0.10
    β_e = 0.14    η_e = 0.02    p(Q) = 0.21

    G²(θ̂_G²; x, N = 1,000) [:= inf_{θ ∈ Θ} G²(θ; x, N = 1,000)] = 12.6

Summary. In this standard example, the test statistics chosen in Step 3 of Neave's step method take the values

    T_X²(x; N) := X²(θ̂_X²; x, N) = 14.7,
    T_G²(x; N) := G²(θ̂_G²; x, N) = 12.6.

2.1.7 Step 4: Critical Region and Standard Example

Next, we deal with Step 4 of Neave's step method (see Section 2.1.3). Under the conditions described in Theorem 2.2, both statistics T_X²(X; N) and T_G²(X; N) have a limiting chi-square distribution with degrees of freedom d = (2^m − 1) − s. In this example, under H₀, we have d = (2⁵ − 1) − 18 = 13. The choice of a critical region to the significance level α ∈ ]0, 1[ (e.g., α = 0.05) is straightforward. As already mentioned, it seems plausible that we may expect small values of these statistics in case the null model really holds; larger values rather speak against the null model. This can be quantified using the upper critical value χ²_{13,α} (d = 13) of the chi-square distribution χ²_13 to the significance level α. Values of the test statistics T_X²(X; N) and T_G²(X; N) greater than χ²_{13,α} are then viewed as critical values pointing to H₁ rather than H₀. (In this way, we take account of the principle that a critical region should consist of values of a test statistic which point only weakly to H₀ but rather most strongly to H₁.) More precisely, the critical/rejection region C to the significance level α is chosen to be the following interval in ℝ:

    C := ]χ²_{13,α}, +∞[.

Then, for N sufficiently large, the probability of rejecting H₀ given that H₀ is correct[19] is quantified by the significance level α (cp.
Remark 3 above):

    P_{H₀}(T_X²(X; N) ∈ C) ≈ α,
    P_{H₀}(T_G²(X; N) ∈ C) ≈ α.

[19] This error is called the error of the first type, or Type I error. There is also the error of the second type, or Type II error, committed by accepting H₀ when in fact H₀ is false.
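The upper critical value χ²_{13,0.05}, and the resulting comparisons for the standard example's statistics, can be reproduced numerically, e.g. with scipy:

```python
from scipy.stats import chi2

alpha, d = 0.05, 13
crit = chi2.ppf(1 - alpha, d)      # upper critical value chi^2_{13, 0.05}
print(round(crit, 2))              # 22.36

# Observed test statistics from the standard example:
T_X2, T_G2 = 14.7, 12.6
print(T_X2 > crit, T_G2 > crit)    # neither exceeds the critical value
```

Since neither statistic falls in the critical region ]χ²_{13,0.05}, +∞[, the null model is not rejected at level 0.05.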

2.1.8 Step 5: Decision Rule and Standard Example

Finally, we come to the last step, Step 5, of Neave's step method. The decision rule is this: if T_X²(x; N) resp. T_G²(x; N) is a value in C, then we reject H₀ in favour of H₁; otherwise, i.e., if T_X²(x; N) resp. T_G²(x; N) is not in C, we do not reject H₀ in favour of H₁ (which, in general, should not be confused with accepting H₀).[20] For instance, if α = 0.05,

    C := ]χ²_{13,0.05}, +∞[ = ]22.36, +∞[.

The values T_X²(x; N) = 14.7 and T_G²(x; N) = 12.6 do not belong to the critical region C. Therefore, H₀ is not rejected based on either test statistic.

2.2 Power-Divergence Family

In this Section 2.2, we review the so-called power-divergence family of goodness-of-fit statistics introduced by Cressie & Read (1984). This family includes the traditional X² and G² statistics described in Section 2.1, and it contains various other goodness-of-fit statistics proposed in the literature (e.g., the Freeman-Tukey statistic, the modified log-likelihood ratio statistic, the Neyman-modified X² statistic). The power-divergence family thus provides a unification of these various common goodness-of-fit statistics. Based on this unification, general results concerning the behavior, similarities, and differences of these statistics can be derived, and valuable alternatives to them suggested (see Read & Cressie, 1988). Throughout Section 2.2, we use the following notation.

Notation: Multinomial Models

Let X = (X_1, X_2, ..., X_t) (t ∈ ℕ) be a t-dimensional random vector with the multinomial distribution X ~ M_t(n, π), where n ∈ ℕ, and π = (π_1, π_2, ..., π_t) is the vector of true cell probabilities π_i > 0 (1 ≤ i ≤ t) with Σ_{i=1}^t π_i = 1; see Definition 2.1. Thus, for any x = (x_1, x_2, ..., x_t) with x_i ∈ ℕ ∪ {0}, 0 ≤ x_i ≤ n, and Σ_{i=1}^t x_i = n, the probability P(X = x) of the vector x of cell counts is

    P(X = x) := P(X_1 = x_1, X_2 = x_2, ..., X_t = x_t) = n! / (∏_{i=1}^t x_i!) · ∏_{i=1}^t π_i^{x_i}.

Remarks

[20] A value T_X²(x; N) resp.
T_G²(x; N) leading to a rejection is called significant.
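As a preview of the family just introduced: scipy implements the Cressie-Read power-divergence statistics directly, with a tuning parameter λ that recovers Pearson's X² (λ = 1), G² (the limit λ → 0, accepted directly as lambda_=0), and the Freeman-Tukey statistic (λ = −1/2). A sketch with invented counts and null cell probabilities:

```python
import numpy as np
from scipy.stats import power_divergence

# Invented counts and null probabilities for a t = 4 cell multinomial.
x = np.array([30, 50, 15, 5])
pi = np.array([0.25, 0.55, 0.15, 0.05])
f_exp = x.sum() * pi                      # expected frequencies n * pi_i

X2, _ = power_divergence(x, f_exp=f_exp, lambda_=1)     # Pearson's X^2
G2, _ = power_divergence(x, f_exp=f_exp, lambda_=0)     # log-likelihood ratio G^2
FT, _ = power_divergence(x, f_exp=f_exp, lambda_=-0.5)  # Freeman-Tukey
print(round(X2, 3), round(G2, 3), round(FT, 3))
```

For a fully specified null (no estimated parameters), each member of the family is referred to a χ² distribution with t − 1 degrees of freedom; the family-wide asymptotics are developed in the remainder of Section 2.2.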


More information

Testing and Model Selection

Testing and Model Selection Testing and Model Selection This is another digression on general statistics: see PE App C.8.4. The EViews output for least squares, probit and logit includes some statistics relevant to testing hypotheses

More information

Composite Hypotheses and Generalized Likelihood Ratio Tests

Composite Hypotheses and Generalized Likelihood Ratio Tests Composite Hypotheses and Generalized Likelihood Ratio Tests Rebecca Willett, 06 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the

More information

f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models

f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models IEEE Transactions on Information Theory, vol.58, no.2, pp.708 720, 2012. 1 f-divergence Estimation and Two-Sample Homogeneity Test under Semiparametric Density-Ratio Models Takafumi Kanamori Nagoya University,

More information

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics

Dover- Sherborn High School Mathematics Curriculum Probability and Statistics Mathematics Curriculum A. DESCRIPTION This is a full year courses designed to introduce students to the basic elements of statistics and probability. Emphasis is placed on understanding terminology and

More information

Hypothesis Testing - Frequentist

Hypothesis Testing - Frequentist Frequentist Hypothesis Testing - Frequentist Compare two hypotheses to see which one better explains the data. Or, alternatively, what is the best way to separate events into two classes, those originating

More information

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test.

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test. Economics 52 Econometrics Professor N.M. Kiefer LECTURE 1: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING NEYMAN-PEARSON LEMMA: Lesson: Good tests are based on the likelihood ratio. The proof is easy in the

More information

The Chi-Square Distributions

The Chi-Square Distributions MATH 183 The Chi-Square Distributions Dr. Neal, WKU The chi-square distributions can be used in statistics to analyze the standard deviation σ of a normally distributed measurement and to test the goodness

More information

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining

More information

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV

ME3620. Theory of Engineering Experimentation. Spring Chapter IV. Decision Making for a Single Sample. Chapter IV Theory of Engineering Experimentation Chapter IV. Decision Making for a Single Sample Chapter IV 1 4 1 Statistical Inference The field of statistical inference consists of those methods used to make decisions

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

http://www.math.uah.edu/stat/hypothesis/.xhtml 1 of 5 7/29/2009 3:14 PM Virtual Laboratories > 9. Hy pothesis Testing > 1 2 3 4 5 6 7 1. The Basic Statistical Model As usual, our starting point is a random

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

Chapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments

Chapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments Chapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments We consider two kinds of random variables: discrete and continuous random variables. For discrete random

More information

Lecture 21: October 19

Lecture 21: October 19 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use

More information

280 CHAPTER 9 TESTS OF HYPOTHESES FOR A SINGLE SAMPLE Tests of Statistical Hypotheses

280 CHAPTER 9 TESTS OF HYPOTHESES FOR A SINGLE SAMPLE Tests of Statistical Hypotheses 280 CHAPTER 9 TESTS OF HYPOTHESES FOR A SINGLE SAMPLE 9-1.2 Tests of Statistical Hypotheses To illustrate the general concepts, consider the propellant burning rate problem introduced earlier. The null

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1, Economics 520 Lecture Note 9: Hypothesis Testing via the Neyman-Pearson Lemma CB 8., 8.3.-8.3.3 Uniformly Most Powerful Tests and the Neyman-Pearson Lemma Let s return to the hypothesis testing problem

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Review. December 4 th, Review

Review. December 4 th, Review December 4 th, 2017 Att. Final exam: Course evaluation Friday, 12/14/2018, 10:30am 12:30pm Gore Hall 115 Overview Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 6: Statistics and Sampling Distributions Chapter

More information

Open Problems in Mixed Models

Open Problems in Mixed Models xxiii Determining how to deal with a not positive definite covariance matrix of random effects, D during maximum likelihood estimation algorithms. Several strategies are discussed in Section 2.15. For

More information

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests Overall Overview INFOWO Statistics lecture S3: Hypothesis testing Peter de Waal Department of Information and Computing Sciences Faculty of Science, Universiteit Utrecht 1 Descriptive statistics 2 Scores

More information

Lecture 21. Hypothesis Testing II

Lecture 21. Hypothesis Testing II Lecture 21. Hypothesis Testing II December 7, 2011 In the previous lecture, we dened a few key concepts of hypothesis testing and introduced the framework for parametric hypothesis testing. In the parametric

More information

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1 4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first

More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

INTERVAL ESTIMATION AND HYPOTHESES TESTING

INTERVAL ESTIMATION AND HYPOTHESES TESTING INTERVAL ESTIMATION AND HYPOTHESES TESTING 1. IDEA An interval rather than a point estimate is often of interest. Confidence intervals are thus important in empirical work. To construct interval estimates,

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Hypothesis testing. Anna Wegloop Niels Landwehr/Tobias Scheffer

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Hypothesis testing. Anna Wegloop Niels Landwehr/Tobias Scheffer Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Hypothesis testing Anna Wegloop iels Landwehr/Tobias Scheffer Why do a statistical test? input computer model output Outlook ull-hypothesis

More information

Chapter 3 : Likelihood function and inference

Chapter 3 : Likelihood function and inference Chapter 3 : Likelihood function and inference 4 Likelihood function and inference The likelihood Information and curvature Sufficiency and ancilarity Maximum likelihood estimation Non-regular models EM

More information

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants 18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

More information

Applied Mathematics Research Report 07-08

Applied Mathematics Research Report 07-08 Estimate-based Goodness-of-Fit Test for Large Sparse Multinomial Distributions by Sung-Ho Kim, Heymi Choi, and Sangjin Lee Applied Mathematics Research Report 0-0 November, 00 DEPARTMENT OF MATHEMATICAL

More information

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Linear Classification CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Example of Linear Classification Red points: patterns belonging

More information

Lecture 10: Generalized likelihood ratio test

Lecture 10: Generalized likelihood ratio test Stat 200: Introduction to Statistical Inference Autumn 2018/19 Lecture 10: Generalized likelihood ratio test Lecturer: Art B. Owen October 25 Disclaimer: These notes have not been subjected to the usual

More information

The Multinomial Model

The Multinomial Model The Multinomial Model STA 312: Fall 2012 Contents 1 Multinomial Coefficients 1 2 Multinomial Distribution 2 3 Estimation 4 4 Hypothesis tests 8 5 Power 17 1 Multinomial Coefficients Multinomial coefficient

More information

1 Hypothesis Testing and Model Selection

1 Hypothesis Testing and Model Selection A Short Course on Bayesian Inference (based on An Introduction to Bayesian Analysis: Theory and Methods by Ghosh, Delampady and Samanta) Module 6: From Chapter 6 of GDS 1 Hypothesis Testing and Model Selection

More information

Hypothesis testing:power, test statistic CMS:

Hypothesis testing:power, test statistic CMS: Hypothesis testing:power, test statistic The more sensitive the test, the better it can discriminate between the null and the alternative hypothesis, quantitatively, maximal power In order to achieve this

More information

Topic 15: Simple Hypotheses

Topic 15: Simple Hypotheses Topic 15: November 10, 2009 In the simplest set-up for a statistical hypothesis, we consider two values θ 0, θ 1 in the parameter space. We write the test as H 0 : θ = θ 0 versus H 1 : θ = θ 1. H 0 is

More information

Testing Independence

Testing Independence Testing Independence Dipankar Bandyopadhyay Department of Biostatistics, Virginia Commonwealth University BIOS 625: Categorical Data & GLM 1/50 Testing Independence Previously, we looked at RR = OR = 1

More information

MATH 240. Chapter 8 Outlines of Hypothesis Tests

MATH 240. Chapter 8 Outlines of Hypothesis Tests MATH 4 Chapter 8 Outlines of Hypothesis Tests Test for Population Proportion p Specify the null and alternative hypotheses, ie, choose one of the three, where p is some specified number: () H : p H : p

More information

HYPOTHESIS TESTING: FREQUENTIST APPROACH.

HYPOTHESIS TESTING: FREQUENTIST APPROACH. HYPOTHESIS TESTING: FREQUENTIST APPROACH. These notes summarize the lectures on (the frequentist approach to) hypothesis testing. You should be familiar with the standard hypothesis testing from previous

More information

Homework 7: Solutions. P3.1 from Lehmann, Romano, Testing Statistical Hypotheses.

Homework 7: Solutions. P3.1 from Lehmann, Romano, Testing Statistical Hypotheses. Stat 300A Theory of Statistics Homework 7: Solutions Nikos Ignatiadis Due on November 28, 208 Solutions should be complete and concisely written. Please, use a separate sheet or set of sheets for each

More information

Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Application of Variance Homogeneity Tests Under Violation of Normality Assumption Application of Variance Homogeneity Tests Under Violation of Normality Assumption Alisa A. Gorbunova, Boris Yu. Lemeshko Novosibirsk State Technical University Novosibirsk, Russia e-mail: gorbunova.alisa@gmail.com

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Information measures in simple coding problems

Information measures in simple coding problems Part I Information measures in simple coding problems in this web service in this web service Source coding and hypothesis testing; information measures A(discrete)source is a sequence {X i } i= of random

More information

EC2001 Econometrics 1 Dr. Jose Olmo Room D309

EC2001 Econometrics 1 Dr. Jose Olmo Room D309 EC2001 Econometrics 1 Dr. Jose Olmo Room D309 J.Olmo@City.ac.uk 1 Revision of Statistical Inference 1.1 Sample, observations, population A sample is a number of observations drawn from a population. Population:

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

BTRY 4090: Spring 2009 Theory of Statistics

BTRY 4090: Spring 2009 Theory of Statistics BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)

More information

The Chi-Square Distributions

The Chi-Square Distributions MATH 03 The Chi-Square Distributions Dr. Neal, Spring 009 The chi-square distributions can be used in statistics to analyze the standard deviation of a normally distributed measurement and to test the

More information

P Values and Nuisance Parameters

P Values and Nuisance Parameters P Values and Nuisance Parameters Luc Demortier The Rockefeller University PHYSTAT-LHC Workshop on Statistical Issues for LHC Physics CERN, Geneva, June 27 29, 2007 Definition and interpretation of p values;

More information

4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49

4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49 4 HYPOTHESIS TESTING 49 4 Hypothesis testing In sections 2 and 3 we considered the problem of estimating a single parameter of interest, θ. In this section we consider the related problem of testing whether

More information

Chapter 7 Comparison of two independent samples

Chapter 7 Comparison of two independent samples Chapter 7 Comparison of two independent samples 7.1 Introduction Population 1 µ σ 1 1 N 1 Sample 1 y s 1 1 n 1 Population µ σ N Sample y s n 1, : population means 1, : population standard deviations N

More information

Part III. A Decision-Theoretic Approach and Bayesian testing

Part III. A Decision-Theoretic Approach and Bayesian testing Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to

More information

Hypothesis Testing. BS2 Statistical Inference, Lecture 11 Michaelmas Term Steffen Lauritzen, University of Oxford; November 15, 2004

Hypothesis Testing. BS2 Statistical Inference, Lecture 11 Michaelmas Term Steffen Lauritzen, University of Oxford; November 15, 2004 Hypothesis Testing BS2 Statistical Inference, Lecture 11 Michaelmas Term 2004 Steffen Lauritzen, University of Oxford; November 15, 2004 Hypothesis testing We consider a family of densities F = {f(x; θ),

More information

Introduction Large Sample Testing Composite Hypotheses. Hypothesis Testing. Daniel Schmierer Econ 312. March 30, 2007

Introduction Large Sample Testing Composite Hypotheses. Hypothesis Testing. Daniel Schmierer Econ 312. March 30, 2007 Hypothesis Testing Daniel Schmierer Econ 312 March 30, 2007 Basics Parameter of interest: θ Θ Structure of the test: H 0 : θ Θ 0 H 1 : θ Θ 1 for some sets Θ 0, Θ 1 Θ where Θ 0 Θ 1 = (often Θ 1 = Θ Θ 0

More information

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided Let us first identify some classes of hypotheses. simple versus simple H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided H 0 : θ θ 0 versus H 1 : θ > θ 0. (2) two-sided; null on extremes H 0 : θ θ 1 or

More information

Review and continuation from last week Properties of MLEs

Review and continuation from last week Properties of MLEs Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that

More information

2.3 Analysis of Categorical Data

2.3 Analysis of Categorical Data 90 CHAPTER 2. ESTIMATION AND HYPOTHESIS TESTING 2.3 Analysis of Categorical Data 2.3.1 The Multinomial Probability Distribution A mulinomial random variable is a generalization of the binomial rv. It results

More information

Topic 19 Extensions on the Likelihood Ratio

Topic 19 Extensions on the Likelihood Ratio Topic 19 Extensions on the Likelihood Ratio Two-Sided Tests 1 / 12 Outline Overview Normal Observations Power Analysis 2 / 12 Overview The likelihood ratio test is a popular choice for composite hypothesis

More information

Statistical Estimation

Statistical Estimation Statistical Estimation Use data and a model. The plug-in estimators are based on the simple principle of applying the defining functional to the ECDF. Other methods of estimation: minimize residuals from

More information

Goodness of Fit Goodness of fit - 2 classes

Goodness of Fit Goodness of fit - 2 classes Goodness of Fit Goodness of fit - 2 classes A B 78 22 Do these data correspond reasonably to the proportions 3:1? We previously discussed options for testing p A = 0.75! Exact p-value Exact confidence

More information

Detection theory. H 0 : x[n] = w[n]

Detection theory. H 0 : x[n] = w[n] Detection Theory Detection theory A the last topic of the course, we will briefly consider detection theory. The methods are based on estimation theory and attempt to answer questions such as Is a signal

More information

HYPOTHESIS TESTING. Hypothesis Testing

HYPOTHESIS TESTING. Hypothesis Testing MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

More information

Lecture 8: Information Theory and Statistics

Lecture 8: Information Theory and Statistics Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang

More information

Quantitative Biology II Lecture 4: Variational Methods

Quantitative Biology II Lecture 4: Variational Methods 10 th March 2015 Quantitative Biology II Lecture 4: Variational Methods Gurinder Singh Mickey Atwal Center for Quantitative Biology Cold Spring Harbor Laboratory Image credit: Mike West Summary Approximate

More information

Lecture 12 November 3

Lecture 12 November 3 STATS 300A: Theory of Statistics Fall 2015 Lecture 12 November 3 Lecturer: Lester Mackey Scribe: Jae Hyuck Park, Christian Fong Warning: These notes may contain factual and/or typographic errors. 12.1

More information

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2

Part 1.) We know that the probability of any specific x only given p ij = p i p j is just multinomial(n, p) where p k1 k 2 Problem.) I will break this into two parts: () Proving w (m) = p( x (m) X i = x i, X j = x j, p ij = p i p j ). In other words, the probability of a specific table in T x given the row and column counts

More information

Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Statistics 135 Fall 2008 Final Exam

Statistics 135 Fall 2008 Final Exam Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations

More information

Topic 17: Simple Hypotheses

Topic 17: Simple Hypotheses Topic 17: November, 2011 1 Overview and Terminology Statistical hypothesis testing is designed to address the question: Do the data provide sufficient evidence to conclude that we must depart from our

More information

Nominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers

Nominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers Nominal Data Greg C Elvers 1 Parametric Statistics The inferential statistics that we have discussed, such as t and ANOVA, are parametric statistics A parametric statistic is a statistic that makes certain

More information