Statistical Hypothesis Testing
Statistical Hypothesis Testing
Dr. Phillip YAM, 2012/2013 Spring Semester
Reference: Chapter 7 ("Tests of Statistical Hypotheses") of Hogg and Tanis.
Section 7.1 Tests about Proportions

A statistical hypothesis test is a formal method of making decisions, based on the probabilistic structure of a random mathematical model, by analyzing the available sample.

Example. The (simple) null hypothesis $H_0: p = 0.06$ completely specifies the distribution. It is tested against the (composite) alternative hypothesis $H_1: p < 0.06$, which does not completely specify the distribution; it is composed of many simple hypotheses.

Possible errors:
- Type I error: rejecting $H_0$ and accepting $H_1$ when $H_0$ is true;
- Type II error: failing to reject $H_0$ when $H_1$ is true (i.e., when $H_0$ is false).
Section 7.1 Tests about Proportions

Consider the test of $H_0: p = p_0$ against $H_1: p > p_0$, where $p_0$ is the hypothesized probability of success. We base the test on the number of successes $Y$ in $n$ independent Bernoulli trials. By the CLT, $Y/n$ has an approximate normal distribution $N[p_0, p_0(1 - p_0)/n]$, provided that $H_0: p = p_0$ is true and $n$ is large. We reject $H_0$ and accept $H_1$ if and only if
$$Z = \frac{Y/n - p_0}{\sqrt{p_0(1 - p_0)/n}} \ge z_\alpha.$$
That is, if $Y/n$ exceeds $p_0$ by at least $z_\alpha$ standard deviations of $Y/n$, we reject $H_0$ and accept the hypothesis $H_1: p > p_0$. The approximate probability of this occurring when $H_0: p = p_0$ is true is $\alpha$, so the significance level of this test is approximately $\alpha$.
Section 7.1 Tests about Proportions

Example 7.1-1: Many commercially manufactured dice are not fair because the spots are really indentations, so that, for example, the 6-side is lighter than the 1-side. Let $p$ be the probability of rolling a 6, and test $H_0: p = 1/6$ against the alternative hypothesis $H_1: p > 1/6$. Suppose that we have a total of $n = 8000$ observations, and let $Y$ equal the number of times that a 6 resulted in the 8000 trials. The experiment yielded $y = 1389$, so the calculated value of the test statistic is
$$z = \frac{1389/8000 - 1/6}{\sqrt{(1/6)(5/6)/8000}} \approx 1.670 > 1.645 = z_{0.05}.$$
Hence the null hypothesis is rejected at the $\alpha = 0.05$ level, and the experimental results indicate that these dice favor a 6 more than a fair die would.
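The one-proportion $Z$ statistic above is simple to compute directly. A minimal sketch in plain Python (with the 5% critical value $z_{0.05} = 1.645$ hard-coded rather than looked up from a normal table) reproduces the dice calculation:

```python
import math

def one_prop_z(y, n, p0):
    """Z = (Y/n - p0) / sqrt(p0 (1 - p0) / n) for testing H0: p = p0."""
    return (y / n - p0) / math.sqrt(p0 * (1 - p0) / n)

# Example 7.1-1: y = 1389 sixes in n = 8000 rolls, H0: p = 1/6
z = one_prop_z(1389, 8000, 1 / 6)
print(round(z, 3), z >= 1.645)  # reject H0 at the 5% level when z >= 1.645
```

The function name and code organization are illustrative, not from the source.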
Section 7.1 Tests about Proportions

Formal statistical hypothesis testing can be regarded as a statistical version of mathematical proof by contradiction; an example of the latter is Euclid's proof that there are infinitely many primes. The analogy:
(1) $H_1$ vs. $H_0$ ($C_1$: infinitely many primes, vs. $C_0$: finitely many primes);
(2) a random sample $X_1, \ldots, X_n$ under $H_0$ (a finite deterministic sequence of primes $p_1, \ldots, p_n$ under $C_0$);
(3) the functional inequality $Z = \dfrac{Y/n - p_0}{\sqrt{p_0(1 - p_0)/n}} \ge z_\alpha$ (a new positive integer $p = p_1 \cdots p_n + 1$);
(4) a conclusion subject to chance (a definite conclusion).
A reasonably good test for a parameter normally relies on the maximum likelihood estimator (more precisely, the sufficient statistic) for the parameter.
Section 7.1 Tests about Proportions

One-sided tests: $H_0: p = p_0$ against $H_1: p < p_0$, and $H_0: p = p_0$ against $H_1: p > p_0$. Two-sided test: $H_0: p = p_0$ against $H_1: p \ne p_0$. As in Example 7.1-1, a test with approximate significance level $\alpha$ rejects $H_0: p = p_0$ in favor of $H_1: p \ne p_0$ if
$$|Z| = \left|\frac{Y/n - p_0}{\sqrt{p_0(1 - p_0)/n}}\right| \ge z_{\alpha/2},$$
since, under $H_0$, $P(|Z| \ge z_{\alpha/2}) \approx \alpha$. The rejection region for $H_0$ is often called the critical region. The p-value associated with a test is the probability, under the null hypothesis $H_0$, that the test statistic (a random variable) equals or exceeds the observed value (a constant) of the test statistic in the direction of the alternative hypothesis.
Section 7.1 Tests about Proportions

Test about the difference of two proportions: let $Y_1$ and $Y_2$ represent, respectively, the numbers of observed successes in $n_1$ and $n_2$ independent trials with probabilities of success $p_1$ and $p_2$. To test $H_0: p_1 - p_2 = 0$, or equivalently $H_0: p_1 = p_2$, let $p = p_1 = p_2$ be the common value under $H_0$. Then:
- $\hat p_1 = Y_1/n_1$ is approximately $N[p_1, p_1(1 - p_1)/n_1]$,
- $\hat p_2 = Y_2/n_2$ is approximately $N[p_2, p_2(1 - p_2)/n_2]$, and
- $\hat p_1 - \hat p_2 = Y_1/n_1 - Y_2/n_2$ is approximately $N[p_1 - p_2,\ p_1(1 - p_1)/n_1 + p_2(1 - p_2)/n_2]$.
Estimate $p$ with $\hat p = (Y_1 + Y_2)/(n_1 + n_2)$, and base the test on the statistic
$$Z = \frac{\hat p_1 - \hat p_2 - 0}{\sqrt{\hat p(1 - \hat p)(1/n_1 + 1/n_2)}},$$
which has an approximate $N(0, 1)$ distribution when the null hypothesis is true.
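The pooled two-proportion statistic can be sketched in a few lines; the counts in the usage line are hypothetical, chosen only to exercise the formula:

```python
import math

def two_prop_z(y1, n1, y2, n2):
    """Pooled Z for H0: p1 = p2, using p_hat = (Y1 + Y2) / (n1 + n2)."""
    p_hat = (y1 + y2) / (n1 + n2)
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (y1 / n1 - y2 / n2) / se

# Hypothetical counts: 60/100 successes vs 40/100 successes
print(round(two_prop_z(60, 100, 40, 100), 3))
```

Compare the result with $\pm z_{\alpha/2}$ for a two-sided test.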
Section 7.1 Tests about Proportions

Remark: In testing both $H_0: p = p_0$ and $H_0: p_1 = p_2$, statisticians sometimes use different denominators for $z$. For tests of single proportions, $\sqrt{p_0(1 - p_0)/n}$ can be replaced by $\sqrt{(y/n)(1 - y/n)/n}$, and for tests of the equality of two proportions, the following denominator can be used:
$$\sqrt{\frac{\hat p_1(1 - \hat p_1)}{n_1} + \frac{\hat p_2(1 - \hat p_2)}{n_2}}.$$
In general, it is difficult to say that one is better than the other; fortunately, the numerical answers are about the same.
Section 7.2 Tests about One Mean

To test which of the two hypotheses, $H_0$ or $H_1$, is true, it is necessary to partition the sample space into two parts, $C$ and $C'$, such that if $(x_1, x_2, \ldots, x_n) \in C$, $H_0$ is rejected, and if $(x_1, x_2, \ldots, x_n) \in C'$, $H_0$ is accepted (not rejected). The rejection region $C$ for $H_0$ is called the critical region for the test; the partitioning of the sample space is specified in terms of the values of a test statistic.
- Type I error: $(x_1, x_2, \ldots, x_n) \in C$ when $H_0$ is true. The probability of a Type I error is called the significance level of the test and is denoted by $\alpha$, i.e., $\alpha = P[(X_1, X_2, \ldots, X_n) \in C;\ H_0]$.
- Type II error: $(x_1, x_2, \ldots, x_n) \in C'$ when $H_1$ is true. The probability of a Type II error is denoted by $\beta$: $\beta = P[(X_1, X_2, \ldots, X_n) \in C';\ H_1]$.
Section 7.2 Tests about One Mean

A decrease in the size of $\alpha$ leads to an increase in the size of $\beta$. Both $\alpha$ and $\beta$ can be decreased if the sample size $n$ is increased.
Section 7.2 Tests about One Mean

When sampling from a normal distribution, the null hypothesis is generally of the form $H_0: \mu = \mu_0$. There are three possibilities for the alternative hypothesis: (i) $\mu$ has increased, $H_1: \mu > \mu_0$; (ii) $\mu$ has decreased, $H_1: \mu < \mu_0$; (iii) $\mu$ has changed, but it is not known whether it has increased or decreased, which gives the two-sided alternative hypothesis $H_1: \mu \ne \mu_0$. A random sample is taken from the distribution; an observed sample mean $\bar x$ that is close (measured in standard deviations of $\bar X$, namely $\sigma/\sqrt{n}$) to $\mu_0$ supports $H_0$.
(I) When the variance is known, consider the test statistic
$$Z = \frac{\bar X - \mu_0}{\sqrt{\sigma^2/n}} = \frac{\bar X - \mu_0}{\sigma/\sqrt{n}},$$
and the critical regions, at significance level $\alpha$, for the three respective alternative hypotheses are (i) $z \ge z_\alpha$, (ii) $z \le -z_\alpha$, and (iii) $|z| \ge z_{\alpha/2}$.
Section 7.2 Tests about One Mean

(II) When the variance is not known, we consider the test statistic
$$T = \frac{\bar X - \mu_0}{\sqrt{S^2/n}} = \frac{\bar X - \mu_0}{S/\sqrt{n}}.$$
The rule rejects $H_0: \mu = \mu_0$ and accepts $H_1: \mu \ne \mu_0$ if and only if
$$|t| = \left|\frac{\bar x - \mu_0}{s/\sqrt{n}}\right| \ge t_{\alpha/2}(n - 1).$$
General comment: many statisticians believe that the observed p-value provides an understandable measure of the truth of $H_0$: the smaller the p-value, the less they believe in $H_0$. We do not reject $H_0$ if the confidence interval covers $\mu_0$; otherwise, we would have to reject $H_0$. Many statisticians believe that estimation is much more important than tests of hypotheses and accordingly approach statistical tests through confidence intervals.
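The one-mean $T$ statistic above translates directly into code; a minimal sketch with the standard library (the data in the usage line are made up for illustration):

```python
import math
import statistics

def one_sample_t(xs, mu0):
    """T = (x_bar - mu0) / (s / sqrt(n)); reject H0 when |T| >= t_{alpha/2}(n-1)."""
    n = len(xs)
    return (statistics.mean(xs) - mu0) / (statistics.stdev(xs) / math.sqrt(n))

# Hypothetical data with mean 3, so T = 0 when mu0 = 3
print(one_sample_t([1, 2, 3, 4, 5], 3))
```

The critical value $t_{\alpha/2}(n-1)$ would still come from a t table or a statistics library.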
Section 7.3 Tests of the Equality of Two Means

A paired sample $(X_1, Y_1), \ldots, (X_n, Y_n)$: if $X$ and $Y$ are dependent (for example, a patient's records before and after a treatment), let $W = X - Y$, and the hypothesis $H_0: \mu_X = \mu_Y$ is replaced with the hypothesis $H_0: \mu_W = 0$.
(I) $X$ and $Y$ are independent and normally distributed, with the variances of $X$ and $Y$ assumed equal. Then
$$T = \frac{\bar X - \bar Y}{\sqrt{\{[(n-1)S_X^2 + (m-1)S_Y^2]/(n+m-2)\}(1/n + 1/m)}} = \frac{\bar X - \bar Y}{S_P\sqrt{1/n + 1/m}}, \qquad S_P = \sqrt{\frac{(n-1)S_X^2 + (m-1)S_Y^2}{n+m-2}},$$
has a t distribution with $r = n + m - 2$ degrees of freedom when $H_0$ is true and the variances are (approximately) equal.
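The pooled two-sample $T$ can be sketched as follows (a plain illustration, not the book's code; the sample lists in the usage line are hypothetical):

```python
import math
import statistics

def pooled_t(xs, ys):
    """T = (x_bar - y_bar) / (S_P sqrt(1/n + 1/m)), with pooled variance S_P^2."""
    n, m = len(xs), len(ys)
    sp2 = ((n - 1) * statistics.variance(xs)
           + (m - 1) * statistics.variance(ys)) / (n + m - 2)
    return (statistics.mean(xs) - statistics.mean(ys)) / math.sqrt(sp2 * (1 / n + 1 / m))

# Hypothetical samples shifted by 1
print(round(pooled_t([1, 2, 3], [2, 3, 4]), 4))
```

Compare $|t|$ with $t_{\alpha/2}(n + m - 2)$ for a two-sided test.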
Section 7.3 Tests of the Equality of Two Means

If the common-variance assumption is violated, but not too badly, the test is satisfactory, but the significance levels are only approximate.
(II) If the variances of $X$ and $Y$ are unequal yet known, then the appropriate test statistic for testing $H_0: \mu_X = \mu_Y$ is
$$Z = \frac{\bar X - \bar Y}{\sqrt{\dfrac{\sigma_X^2}{n} + \dfrac{\sigma_Y^2}{m}}},$$
which has a standard normal distribution when the null hypothesis is true.
(III) If the variances are unknown and unequal, and the sample sizes are large, replace $\sigma_X^2$ with $S_X^2$ and $\sigma_Y^2$ with $S_Y^2$ in the above equation. The resulting statistic has an approximate $N(0, 1)$ distribution.
Section 7.3 Tests of the Equality of Two Means

As long as the underlying distributions are not highly skewed, the normality assumptions are not too critical. As distributions become non-normal and highly skewed, the sample mean and sample variance become more dependent, and nonparametric methods may have to be used. When the distributions are close to normal but the variances seem to differ by a great deal, the t statistic should again be avoided, particularly if the sample sizes are also different.
(IV) With unequal variances and small sample sizes, use Welch's t-statistic.
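The slides name Welch's statistic without writing it out. A hedged sketch, using the standard Welch statistic together with the Welch-Satterthwaite degrees-of-freedom approximation (both standard formulas, not taken from this chapter):

```python
import math
import statistics

def welch_t(xs, ys):
    """Welch's t and the Welch-Satterthwaite approximate degrees of freedom."""
    n, m = len(xs), len(ys)
    vx, vy = statistics.variance(xs) / n, statistics.variance(ys) / m
    t = (statistics.mean(xs) - statistics.mean(ys)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, df
```

With equal sample sizes and equal sample variances, the approximate df reduces to $n + m - 2$, matching the pooled test.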
Example on Hypothesis Testing: Classroom Activities

Source: Beau Lotto. Exercises: (1) a single-mean test for each class; (2) a test of the equality of means from two classes.
Section 7.4 Tests for Variances

(I) Test of hypothesis for a single variance, $H_0: \sigma_X^2 = \sigma_0^2$, under normality: the critical region is given in terms of the chi-square test statistic
$$\chi^2 = \frac{(n-1)S^2}{\sigma_0^2}.$$
(II) Test for the equality of two variances, $H_0: \sigma_X^2/\sigma_Y^2 = 1$, from normal populations, using two independent random samples of $n$ observations of $X$ and $m$ observations of $Y$. When $H_0$ is true,
$$F = \frac{[(n-1)S_X^2/\sigma_X^2]/(n-1)}{[(m-1)S_Y^2/\sigma_Y^2]/(m-1)} = \frac{S_X^2}{S_Y^2}$$
has an F distribution with $r_1 = n - 1$ and $r_2 = m - 1$ degrees of freedom. If $H_0$ is true, the observed value of $F$ is expected to be close to 1.
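Under $H_0$ the variance-ratio statistic is just the ratio of the two sample variances; a minimal sketch (the samples in the test of equality are hypothetical):

```python
import statistics

def f_stat(xs, ys):
    """F = S_X^2 / S_Y^2; compare with F quantiles at (n - 1, m - 1) df."""
    return statistics.variance(xs) / statistics.variance(ys)

# Hypothetical samples: the second has variance 4 times the first
print(f_stat([1, 2, 3], [2, 4, 6]))
```

An observed $F$ far from 1 (in either tail of $F(n-1, m-1)$) is evidence against equal variances.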
Section 7.5 One-Factor Analysis of Variance (ANOVA)

Experimenters often want to compare more than two treatments, e.g., yields of several different corn hybrids, results due to three or more teaching techniques, miles per gallon obtained from many different types of compact cars, or consumption across different classes (upper, middle, or lower). Consider $m$ normal distributions with unknown means $\mu_1, \mu_2, \ldots, \mu_m$ and an unknown, but common, variance $\sigma^2$. We test the equality of the $m$ means, namely $H_0: \mu_1 = \mu_2 = \cdots = \mu_m = \mu$, with $\mu$ unspecified, against all possible alternative hypotheses $H_1$.
Section 7.5 One-Factor Analysis of Variance (ANOVA)

Let $X_{i1}, X_{i2}, \ldots, X_{in_i}$ represent a random sample of size $n_i$ from the normal distribution $N(\mu_i, \sigma^2)$, $i = 1, 2, \ldots, m$. With $n = n_1 + n_2 + \cdots + n_m$, denote the sample means by
$$\bar X_{..} = \frac{1}{n}\sum_{i=1}^m \sum_{j=1}^{n_i} X_{ij} \qquad \text{and} \qquad \bar X_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \quad i = 1, 2, \ldots, m.$$
Then
$$SS(TO) = \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar X_{..})^2 = \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar X_{i.} + \bar X_{i.} - \bar X_{..})^2$$
$$= \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar X_{i.})^2 + \sum_{i=1}^m \sum_{j=1}^{n_i} (\bar X_{i.} - \bar X_{..})^2 + 2\sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar X_{i.})(\bar X_{i.} - \bar X_{..}).$$
Section 7.5 One-Factor Analysis of Variance (ANOVA)

Using the facts
$$2\sum_{i=1}^m \sum_{j=1}^{n_i} (\bar X_{i.} - \bar X_{..})(X_{ij} - \bar X_{i.}) = 2\sum_{i=1}^m (\bar X_{i.} - \bar X_{..})(n_i \bar X_{i.} - n_i \bar X_{i.}) = 0$$
and
$$\sum_{i=1}^m \sum_{j=1}^{n_i} (\bar X_{i.} - \bar X_{..})^2 = \sum_{i=1}^m n_i (\bar X_{i.} - \bar X_{..})^2,$$
we deduce that
$$SS(TO) = \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar X_{i.})^2 + \sum_{i=1}^m n_i (\bar X_{i.} - \bar X_{..})^2.$$
Section 7.5 One-Factor Analysis of Variance (ANOVA)

- $SS(TO) = \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar X_{..})^2$, the total sum of squares;
- $SS(E) = \sum_{i=1}^m \sum_{j=1}^{n_i} (X_{ij} - \bar X_{i.})^2$, the sum of squares within treatments, groups, or classes, often called the error sum of squares;
- $SS(T) = \sum_{i=1}^m n_i (\bar X_{i.} - \bar X_{..})^2$, the sum of squares among the different treatments, groups, or classes, often called the between-treatment sum of squares.
Thus $SS(TO) = SS(E) + SS(T)$.
Section 7.5 One-Factor Analysis of Variance (ANOVA)

When $H_0$ is true, $SS(TO)/\sigma^2$ is $\chi^2(n-1)$, so $E[SS(TO)/(n-1)] = \sigma^2$. Let
$$W_i = \frac{\sum_{j=1}^{n_i}(X_{ij} - \bar X_{i.})^2}{n_i - 1} \quad \text{for } i = 1, 2, \ldots, m;$$
then $(n_i - 1)W_i/\sigma^2$ is $\chi^2(n_i - 1)$. Therefore, no matter whether $H_0$ is true or not,
$$\sum_{i=1}^m \frac{(n_i - 1)W_i}{\sigma^2} = \frac{SS(E)}{\sigma^2}$$
is also chi-square, with $(n_1 - 1) + (n_2 - 1) + \cdots + (n_m - 1) = n - m$ degrees of freedom. Moreover,
$$\frac{SS(TO)}{\sigma^2} = \frac{SS(E)}{\sigma^2} + \frac{SS(T)}{\sigma^2},$$
where $SS(TO)/\sigma^2$ is $\chi^2(n-1)$ (under $H_0$) and $SS(E)/\sigma^2$ is $\chi^2(n-m)$.
Section 7.5 One-Factor Analysis of Variance (ANOVA)

(Theorem 7.5-1) Let $Q = Q_1 + Q_2 + \cdots + Q_k$, where $Q, Q_1, \ldots, Q_k$ are $k + 1$ real quadratic forms in $n$ mutually independent (mean-zero) random variables normally distributed with the same variance $\sigma^2$. Let $Q/\sigma^2, Q_1/\sigma^2, \ldots, Q_{k-1}/\sigma^2$ have chi-square distributions with $r, r_1, \ldots, r_{k-1}$ degrees of freedom, respectively. If $Q_k$ is nonnegative, then (a) $Q_1, \ldots, Q_k$ are mutually independent, and hence (b) $Q_k/\sigma^2$ has a chi-square distribution with $r - (r_1 + \cdots + r_{k-1}) = r_k$ degrees of freedom.
Applications: (1) re-deriving (a) the independence of $\bar X$ and $S^2$ and (b) the distribution of $(n-1)S^2/\sigma^2$; (2) because $SS(T) \ge 0$, applying the theorem, we deduce that $SS(E)$ and $SS(T)$ are independent and that the distribution of $SS(T)/\sigma^2$ is $\chi^2(m-1)$.
Section 7.5 One-Factor Analysis of Variance (ANOVA)

Back to testing $H_0: \mu_1 = \mu_2 = \cdots = \mu_m = \mu$. Note that $SS(E)/(n-m)$ is an unbiased estimator of $\sigma^2$ whether $H_0$ is true or false. If $\mu_1, \mu_2, \ldots, \mu_m$ are not all equal, the expected value of the estimator based on $SS(T)$ will be greater than $\sigma^2$:
$$E[SS(T)] = E\left[\sum_{i=1}^m n_i(\bar X_{i.} - \bar X_{..})^2\right] = E\left[\sum_{i=1}^m n_i \bar X_{i.}^2 - n\bar X_{..}^2\right] = \sum_{i=1}^m n_i\{\mathrm{Var}(\bar X_{i.}) + [E(\bar X_{i.})]^2\} - n\{\mathrm{Var}(\bar X_{..}) + [E(\bar X_{..})]^2\}$$
$$= \sum_{i=1}^m n_i\left\{\frac{\sigma^2}{n_i} + \mu_i^2\right\} - n\left\{\frac{\sigma^2}{n} + \bar\mu^2\right\} = (m-1)\sigma^2 + \sum_{i=1}^m n_i(\mu_i - \bar\mu)^2,$$
where $\bar\mu = (1/n)\sum_{i=1}^m n_i \mu_i$.
Section 7.5 One-Factor Analysis of Variance (ANOVA)

If $\mu_1 = \mu_2 = \cdots = \mu_m = \mu$, then
$$E\left(\frac{SS(T)}{m-1}\right) = \sigma^2.$$
If the means are not all equal, then
$$E\left(\frac{SS(T)}{m-1}\right) = \sigma^2 + \frac{\sum_{i=1}^m n_i(\mu_i - \bar\mu)^2}{m-1} > \sigma^2.$$
We base our test of $H_0$ on the ratio of $SS(T)/(m-1)$ and $SS(E)/(n-m)$; under $H_0$, both are unbiased estimators of $\sigma^2$, so the ratio would assume values near 1. When the means $\mu_1, \mu_2, \ldots, \mu_m$ begin to differ, this ratio tends to become large, since $E[SS(T)/(m-1)]$ gets larger.
Section 7.5 One-Factor Analysis of Variance (ANOVA)

Under $H_0$,
$$F = \frac{SS(T)/(m-1)}{SS(E)/(n-m)} = \frac{[SS(T)/\sigma^2]/(m-1)}{[SS(E)/\sigma^2]/(n-m)}$$
has an F distribution with $m - 1$ and $n - m$ degrees of freedom, because $SS(T)/\sigma^2$ and $SS(E)/\sigma^2$ are independent chi-square variables. We reject $H_0$ if the observed value of $F$ is too large; the critical region is of the form $F \ge F_\alpha(m-1, n-m)$.
Section 7.5 One-Factor Analysis of Variance (ANOVA)

Alternative (computational) formulas:
$$SS(TO) = \sum_{i=1}^m \sum_{j=1}^{n_i} X_{ij}^2 - \frac{1}{n}\left(\sum_{i=1}^m \sum_{j=1}^{n_i} X_{ij}\right)^2,$$
$$SS(T) = \sum_{i=1}^m \frac{1}{n_i}\left(\sum_{j=1}^{n_i} X_{ij}\right)^2 - \frac{1}{n}\left(\sum_{i=1}^m \sum_{j=1}^{n_i} X_{ij}\right)^2,$$
$$SS(E) = SS(TO) - SS(T).$$
The F test works quite well even if the underlying distributions are non-normal, unless they are highly skewed or the variances are quite different.
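The one-way ANOVA decomposition and F ratio above can be sketched directly from the definitions (the two-group data in the test are made up; function names are illustrative):

```python
import statistics

def one_way_anova(groups):
    """Return (SS(T), SS(E), F), with F = [SS(T)/(m-1)] / [SS(E)/(n-m)]."""
    m = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n           # X_bar_..
    ss_t = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_e = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return ss_t, ss_e, (ss_t / (m - 1)) / (ss_e / (n - m))
```

By construction, $SS(T) + SS(E)$ reproduces the total sum of squares $SS(TO)$.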
Section 7.5 One-Factor Analysis of Variance (ANOVA)

For only two populations, compare with the two-sided t test under the common-variance assumption:
$$T = \frac{\bar X - \bar Y}{\sqrt{\dfrac{(n-1)S_X^2 + (m-1)S_Y^2}{n+m-2}\left(\dfrac{1}{n} + \dfrac{1}{m}\right)}}.$$
The square of the t-statistic, $T^2$, is an F-statistic with degrees of freedom $1$ and $n + m - 2$. Also note that
$$(n-1)S_X^2 + (m-1)S_Y^2 = \sum_{i=1}^n (x_i - \bar x)^2 + \sum_{i=1}^m (y_i - \bar y)^2 = SS(E),$$
$$\frac{(\bar x - \bar y)^2}{\frac{1}{n} + \frac{1}{m}} = n\left(\bar x - \frac{n\bar x + m\bar y}{n+m}\right)^2 + m\left(\bar y - \frac{n\bar x + m\bar y}{n+m}\right)^2 = SS(T),$$
so that
$$T^2 = \frac{SS(T)/1}{SS(E)/(n+m-2)} = F(1,\ n+m-2).$$
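The identity $T^2 = F(1, n+m-2)$ can be checked numerically; the sketch below recomputes both statistics on two small hypothetical samples and confirms they agree:

```python
import math
import statistics

def pooled_t(xs, ys):
    """Two-sample t under the common-variance assumption."""
    n, m = len(xs), len(ys)
    sp2 = ((n - 1) * statistics.variance(xs)
           + (m - 1) * statistics.variance(ys)) / (n + m - 2)
    return (statistics.mean(xs) - statistics.mean(ys)) / math.sqrt(sp2 * (1 / n + 1 / m))

def anova_f(groups):
    """One-way ANOVA F = [SS(T)/(m-1)] / [SS(E)/(n-m)]."""
    m = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_t = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_e = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_t / (m - 1)) / (ss_e / (n - m))

xs, ys = [1.0, 2.0, 3.0], [4.0, 5.0, 7.0]   # hypothetical samples
t = pooled_t(xs, ys)
f = anova_f([xs, ys])
assert abs(t * t - f) < 1e-9                 # T^2 equals F(1, n + m - 2)
```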
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

Example: dependence of real estate prices on district and age of the building. Assume that there are two factors (attributes), one with $a$ levels and the other with $b$ levels. $X_{ij}$ is $N(\mu_{ij}, \sigma^2)$, $i = 1, 2, \ldots, a$ and $j = 1, 2, \ldots, b$, and the $n = ab$ random variables are independent. Assume that the means $\mu_{ij}$ are composed of a row effect, a column effect, and an overall effect in an additive way, namely
$$\mu_{ij} = \mu + \alpha_i + \beta_j, \qquad \text{where } \sum_{i=1}^a \alpha_i = 0 \text{ and } \sum_{j=1}^b \beta_j = 0.$$
The parameter $\alpha_i$ represents the $i$th row effect, and $\beta_j$ represents the $j$th column effect.
(a) To test the hypothesis that there is no row effect, test $H_A: \alpha_1 = \alpha_2 = \cdots = \alpha_a = 0$ (consistent with $\sum_{i=1}^a \alpha_i = 0$).
(b) To test that there is no column effect, test $H_B: \beta_1 = \beta_2 = \cdots = \beta_b = 0$ (consistent with $\sum_{j=1}^b \beta_j = 0$).
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

Consider the sum of squares
$$SS(TO) = \sum_{i=1}^a \sum_{j=1}^b (X_{ij} - \bar X_{..})^2 = \sum_{i=1}^a \sum_{j=1}^b [(\bar X_{i.} - \bar X_{..}) + (\bar X_{.j} - \bar X_{..}) + (X_{ij} - \bar X_{i.} - \bar X_{.j} + \bar X_{..})]^2$$
$$= b\sum_{i=1}^a (\bar X_{i.} - \bar X_{..})^2 + a\sum_{j=1}^b (\bar X_{.j} - \bar X_{..})^2 + \sum_{i=1}^a \sum_{j=1}^b (X_{ij} - \bar X_{i.} - \bar X_{.j} + \bar X_{..})^2 = SS(A) + SS(B) + SS(E).$$
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

The distribution of the error sum of squares $SS(E)$ does not depend on the means $\mu_{ij}$, provided that the additive model is correct; hence its distribution is the same whether $H_A$ or $H_B$ is true or not. Note that
$$X_{ij} - \bar X_{i.} - \bar X_{.j} + \bar X_{..} = X_{ij} - (\bar X_{i.} - \bar X_{..}) - (\bar X_{.j} - \bar X_{..}) - \bar X_{..},$$
which is similar to $X_{ij} - \mu_{ij} = X_{ij} - \alpha_i - \beta_j - \mu$. When both $H_A$ and $H_B$ are true, $SS(TO)/\sigma^2$ is $\chi^2(ab - 1)$, and both $SS(A)/\sigma^2$ and $SS(B)/\sigma^2$ are chi-square variables, namely $\chi^2(a-1)$ and $\chi^2(b-1)$. Since $SS(E) \ge 0$, by Theorem 7.5-1, $SS(A)$, $SS(B)$, and $SS(E)$ are all independent, and $SS(E)/\sigma^2$ is a chi-square variable with $ab - 1 - (a-1) - (b-1) = (a-1)(b-1)$ degrees of freedom.
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

(I) To test $H_A: \alpha_1 = \alpha_2 = \cdots = \alpha_a = 0$, consider the F-statistic
$$F_A = \frac{SS(A)/[\sigma^2(a-1)]}{SS(E)/[\sigma^2(a-1)(b-1)]} = \frac{SS(A)/(a-1)}{SS(E)/[(a-1)(b-1)]},$$
which has an F distribution with $a - 1$ and $(a-1)(b-1)$ degrees of freedom when $H_A$ is true; $H_A$ is rejected if the observed value satisfies $F_A \ge F_\alpha[a-1, (a-1)(b-1)]$.
(II) To test $H_B: \beta_1 = \beta_2 = \cdots = \beta_b = 0$ against all alternatives, use
$$F_B = \frac{SS(B)/[\sigma^2(b-1)]}{SS(E)/[\sigma^2(a-1)(b-1)]} = \frac{SS(B)/(b-1)}{SS(E)/[(a-1)(b-1)]},$$
which has an F distribution with $b - 1$ and $(a-1)(b-1)$ degrees of freedom, provided that $H_B$ is true.
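The two-way decomposition (one observation per cell) follows directly from the row, column, and grand means; a minimal sketch, with a hypothetical 2x2 table in the usage line:

```python
def two_way_ss(x):
    """SS(A), SS(B), SS(E) for an a x b table x with one observation per cell."""
    a, b = len(x), len(x[0])
    grand = sum(map(sum, x)) / (a * b)                       # X_bar_..
    row = [sum(x[i]) / b for i in range(a)]                  # X_bar_i.
    col = [sum(x[i][j] for i in range(a)) / a for j in range(b)]  # X_bar_.j
    ss_a = b * sum((r - grand) ** 2 for r in row)
    ss_b = a * sum((c - grand) ** 2 for c in col)
    ss_e = sum((x[i][j] - row[i] - col[j] + grand) ** 2
               for i in range(a) for j in range(b))
    return ss_a, ss_b, ss_e

print(two_way_ss([[1, 2], [3, 5]]))  # hypothetical 2x2 table
```

The three components sum to $SS(TO)$, and $F_A$, $F_B$ follow by dividing by the degrees of freedom as above.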
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

(III) Test for interactions between the two factors: particular combinations of the two factors might interact differently from what is expected under the additive model. Assume that $X_{ijk}$, $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, c$, are $n = abc$ random variables that are mutually independent and have normal distributions with a common, but unknown, variance $\sigma^2$. The mean of each $X_{ijk}$, $k = 1, 2, \ldots, c$, is
$$\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}, \qquad \text{where } \sum_{i=1}^a \alpha_i = 0,\ \sum_{j=1}^b \beta_j = 0,\ \sum_{i=1}^a \gamma_{ij} = 0,\ \text{and } \sum_{j=1}^b \gamma_{ij} = 0;$$
$\gamma_{ij}$ is called the interaction associated with cell $(i, j)$. We test the hypotheses that (a) the row effects are equal to zero, (b) the column effects are equal to zero, and (c) there is no interaction.
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

We use the notations
$$\bar X_{ij.} = \frac{1}{c}\sum_{k=1}^c X_{ijk}, \qquad \bar X_{i..} = \frac{1}{bc}\sum_{j=1}^b \sum_{k=1}^c X_{ijk}, \qquad \bar X_{.j.} = \frac{1}{ac}\sum_{i=1}^a \sum_{k=1}^c X_{ijk}, \qquad \bar X_{...} = \frac{1}{abc}\sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^c X_{ijk}.$$
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

We again have the total sum of squares
$$SS(TO) = \sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^c (X_{ijk} - \bar X_{...})^2 = bc\sum_{i=1}^a (\bar X_{i..} - \bar X_{...})^2 + ac\sum_{j=1}^b (\bar X_{.j.} - \bar X_{...})^2$$
$$+\ c\sum_{i=1}^a \sum_{j=1}^b (\bar X_{ij.} - \bar X_{i..} - \bar X_{.j.} + \bar X_{...})^2 + \sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^c (X_{ijk} - \bar X_{ij.})^2 = SS(A) + SS(B) + SS(AB) + SS(E).$$
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

Under the null hypothesis that all the means equal the same value $\mu$, $SS(TO)/\sigma^2$ is $\chi^2(abc - 1)$, and $SS(A)/\sigma^2$ and $SS(B)/\sigma^2$ are $\chi^2(a-1)$ and $\chi^2(b-1)$, respectively. Moreover, for each $(i, j)$,
$$\frac{\sum_{k=1}^c (X_{ijk} - \bar X_{ij.})^2}{\sigma^2} \ \text{is}\ \chi^2(c-1);$$
therefore $SS(E)/\sigma^2$ is the sum of $ab$ independent chi-square variables of this kind and thus is $\chi^2[ab(c-1)]$. Since $SS(AB) \ge 0$, by Theorem 7.5-1, $SS(A)/\sigma^2$, $SS(B)/\sigma^2$, $SS(AB)/\sigma^2$, and $SS(E)/\sigma^2$ are mutually independent chi-square variables with $a - 1$, $b - 1$, $(a-1)(b-1)$, and $ab(c-1)$ degrees of freedom, respectively.
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

(i) The statistic for testing the hypothesis $H_{AB}: \gamma_{ij} = 0$, $i = 1, 2, \ldots, a$, $j = 1, 2, \ldots, b$, against all alternatives is
$$F_{AB} = \frac{c\sum_{i=1}^a \sum_{j=1}^b (\bar X_{ij.} - \bar X_{i..} - \bar X_{.j.} + \bar X_{...})^2\big/[\sigma^2(a-1)(b-1)]}{\sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^c (X_{ijk} - \bar X_{ij.})^2\big/[\sigma^2 ab(c-1)]} = \frac{SS(AB)/[(a-1)(b-1)]}{SS(E)/[ab(c-1)]},$$
which has an F distribution with $(a-1)(b-1)$ and $ab(c-1)$ degrees of freedom when $H_{AB}$ is true. If $F_{AB} \ge F_\alpha[(a-1)(b-1), ab(c-1)]$, we reject $H_{AB}$ and say that there is a difference among the means, since there seems to be interaction.
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

(ii) The statistic for testing the hypothesis $H_A: \alpha_1 = \alpha_2 = \cdots = \alpha_a = 0$ against all alternatives is
$$F_A = \frac{bc\sum_{i=1}^a (\bar X_{i..} - \bar X_{...})^2\big/[\sigma^2(a-1)]}{\sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^c (X_{ijk} - \bar X_{ij.})^2\big/[\sigma^2 ab(c-1)]} = \frac{SS(A)/(a-1)}{SS(E)/[ab(c-1)]},$$
which has an F distribution with $a - 1$ and $ab(c-1)$ degrees of freedom when $H_A$ is true.
Section 7.6 Two-Factor Analysis of Variance (2-way ANOVA)

(iii) The statistic for testing the hypothesis $H_B: \beta_1 = \beta_2 = \cdots = \beta_b = 0$ against all alternatives is
$$F_B = \frac{ac\sum_{j=1}^b (\bar X_{.j.} - \bar X_{...})^2\big/[\sigma^2(b-1)]}{\sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^c (X_{ijk} - \bar X_{ij.})^2\big/[\sigma^2 ab(c-1)]} = \frac{SS(B)/(b-1)}{SS(E)/[ab(c-1)]},$$
which has an F distribution with $b - 1$ and $ab(c-1)$ degrees of freedom when $H_B$ is true.
Section 7.7 Tests concerning Regression and Correlation

Let $X$ and $Y$ have a bivariate normal distribution. We use the sample correlation coefficient to test the hypothesis $H_0: \rho = 0$ and also to form a confidence interval for $\rho$. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ denote a random sample from a bivariate normal distribution with parameters $\mu_X, \mu_Y, \sigma_X^2, \sigma_Y^2$, and $\rho$. The sample correlation coefficient is
$$R = \frac{\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2}\sqrt{\frac{1}{n-1}\sum_{i=1}^n (Y_i - \bar Y)^2}} = \frac{S_{XY}}{S_X S_Y}.$$
Section 7.7 Tests concerning Regression and Correlation

Note that
$$R\frac{S_Y}{S_X} = \frac{S_{XY}}{S_X^2} = \frac{\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2}$$
is exactly the solution that we obtained for $\hat\beta$ in Section 6.7. If $H_0: \rho = 0$ is true, $Y_1, Y_2, \ldots, Y_n$ are independent of $X_1, X_2, \ldots, X_n$, and thus $\beta = \rho\sigma_Y/\sigma_X = 0$. The conditional distribution of
$$\hat\beta = \frac{\sum_{i=1}^n (x_i - \bar x)(Y_i - \bar Y)}{\sum_{i=1}^n (x_i - \bar x)^2},$$
given $X_1 = x_1, \ldots, X_n = x_n$, is $N[0,\ \sigma_Y^2/((n-1)s_x^2)]$ when $s_x^2 > 0$.
Section 7.7 Tests concerning Regression and Correlation

Recall from Section 6.7 that the conditional distribution of
$$\frac{\sum_{i=1}^n [Y_i - \bar Y - (S_{xY}/s_x^2)(x_i - \bar x)]^2}{\sigma_Y^2} = \frac{(n-1)S_Y^2(1 - R^2)}{\sigma_Y^2},$$
given that $X_1 = x_1, \ldots, X_n = x_n$, is $\chi^2(n-2)$ and is independent of $\hat\beta$. When $\rho = 0$, the conditional distribution of
$$T = \frac{(RS_Y/s_x)\big/\left(\sigma_Y/(\sqrt{n-1}\,s_x)\right)}{\sqrt{[(n-1)S_Y^2(1 - R^2)/\sigma_Y^2][1/(n-2)]}} = \frac{R\sqrt{n-2}}{\sqrt{1 - R^2}}$$
is t with $n - 2$ degrees of freedom. Since the conditional distribution of $T$, given $X_1 = x_1, \ldots, X_n = x_n$, does not depend on $x_1, x_2, \ldots, x_n$, the unconditional distribution of $T$ must be t with $n - 2$ degrees of freedom, and $T$ and $(X_1, X_2, \ldots, X_n)$ are independent when $\rho = 0$.
Section 7.7 Tests concerning Regression and Correlation

(Remark) In the discussion of the distribution of $T$, nothing was said about the distribution of $X_1, X_2, \ldots, X_n$. If $X$ and $Y$ are independent and $Y$ has a normal distribution, then $T$ has a t distribution whatever the distribution of $X$. The roles of $X$ and $Y$ can be reversed in all of this development. $T$ can be used to test $H_0: \rho = 0$; if $H_1: \rho > 0$, we use the critical region defined by the observed $T \ge t_\alpha(n-2)$, since a large $T$ implies a large $R$. Provided that $\rho = 0$, the p.d.f. of $R$ is
$$g(r) = \frac{\Gamma[(n-1)/2]}{\Gamma(1/2)\Gamma[(n-2)/2]}(1 - r^2)^{(n-4)/2}, \qquad -1 < r < 1.$$
(See Appendix B, Table XI.)
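The statistic $T = R\sqrt{n-2}/\sqrt{1-R^2}$ is easy to compute from raw data; a minimal sketch (the four-point data in the test are hypothetical):

```python
import math

def corr_t(xs, ys):
    """Return (r, T) with T = r sqrt(n-2) / sqrt(1 - r^2), for H0: rho = 0."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
    sxx = sum((u - mx) ** 2 for u in xs)
    syy = sum((v - my) ** 2 for v in ys)
    r = sxy / math.sqrt(sxx * syy)
    return r, r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
```

Compare $T$ with $t_\alpha(n-2)$ for $H_1: \rho > 0$, or $|T|$ with $t_{\alpha/2}(n-2)$ for the two-sided alternative.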
Section 7.7 Tests concerning Regression and Correlation

(Proof)
$$G(r) = P(R \le r) = P\left(T \le \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right) = \int_{-\infty}^{r\sqrt{n-2}/\sqrt{1-r^2}} h(t)\,dt,$$
where
$$h(t) = \frac{\Gamma[(n-1)/2]}{\sqrt{n-2}\,\Gamma(1/2)\Gamma[(n-2)/2]}\left(1 + \frac{t^2}{n-2}\right)^{-(n-1)/2}.$$
The derivative of $G(r)$ with respect to $r$ is
$$g(r) = h\left(\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)\frac{d}{dr}\left(\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right).$$
To test the hypothesis $H_0: \rho = 0$ against the alternative hypothesis $H_1: \rho \ne 0$ at significance level $\alpha$, select either a constant $r_{\alpha/2}(n-2)$ or a constant $t_{\alpha/2}(n-2)$ so that
$$\alpha = P(|R| \ge r_{\alpha/2}(n-2);\ H_0) = P(|T| \ge t_{\alpha/2}(n-2);\ H_0).$$
Section 7.7 Tests concerning Regression and Correlation

To test $H_0: \rho = \rho_0$, an approximate test of size $\alpha$ can be obtained by using the fact that
$$W = \frac{1}{2}\ln\frac{1+R}{1-R}$$
has an approximate normal distribution with mean $(1/2)\ln[(1+\rho)/(1-\rho)]$ and variance $1/(n-3)$ (since $R$ has an asymptotic normal distribution with mean $\rho$ and variance $(1-\rho^2)^2/n$). A test of $H_0: \rho = \rho_0$ can be based on the statistic
$$z = \frac{\dfrac{1}{2}\ln\dfrac{1+R}{1-R} - \dfrac{1}{2}\ln\dfrac{1+\rho_0}{1-\rho_0}}{\sqrt{1/(n-3)}},$$
which has a distribution that is approximately $N(0, 1)$.
Section 7.7 Tests concerning Regression and Correlation

An approximate $100(1-\alpha)\%$ confidence interval for $\rho$ follows from
$$P\left(-c \le \frac{(1/2)\ln[(1+R)/(1-R)] - (1/2)\ln[(1+\rho)/(1-\rho)]}{\sqrt{1/(n-3)}} \le c\right) \approx 1 - \alpha,$$
which can be rearranged as
$$P\left(\frac{1 + R - (1-R)\exp(2c/\sqrt{n-3})}{1 + R + (1-R)\exp(2c/\sqrt{n-3})} \le \rho \le \frac{1 + R - (1-R)\exp(-2c/\sqrt{n-3})}{1 + R + (1-R)\exp(-2c/\sqrt{n-3})}\right) \approx 1 - \alpha.$$
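Since $W = \frac{1}{2}\ln\frac{1+R}{1-R} = \operatorname{atanh}(R)$, the interval endpoints above are algebraically the same as $\tanh(\operatorname{atanh}(R) \mp c/\sqrt{n-3})$, which gives a compact sketch (the default $c = 1.96$ corresponds to the usual $\alpha = 0.05$, $z_{0.025} \approx 1.96$):

```python
import math

def rho_ci(r, n, c=1.96):
    """Approximate CI for rho via Fisher's transform W = atanh(R); c = z_{alpha/2}."""
    half = c / math.sqrt(n - 3)
    return math.tanh(math.atanh(r) - half), math.tanh(math.atanh(r) + half)

# Hypothetical: r = 0.5 from n = 28 pairs
print(rho_ci(0.5, 28))
```

The $\tanh$ form is only a rewriting of the $\exp$ form in the displayed probability statement, not a different method.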
47 The end of Chapter 7
More informationChap The McGraw-Hill Companies, Inc. All rights reserved.
11 pter11 Chap Analysis of Variance Overview of ANOVA Multiple Comparisons Tests for Homogeneity of Variances Two-Factor ANOVA Without Replication General Linear Model Experimental Design: An Overview
More informationRegression and Statistical Inference
Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF
More informationSolution: First note that the power function of the test is given as follows,
Problem 4.5.8: Assume the life of a tire given by X is distributed N(θ, 5000 ) Past experience indicates that θ = 30000. The manufacturere claims the tires made by a new process have mean θ > 30000. Is
More informationProbability Theory and Statistics. Peter Jochumzen
Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................
More informationQualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf
Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section
More informationStatistics. Statistics
The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,
More informationUQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables
UQ, Semester 1, 2017, Companion to STAT2201/CIVL2530 Exam Formulae and Tables To be provided to students with STAT2201 or CIVIL-2530 (Probability and Statistics) Exam Main exam date: Tuesday, 20 June 1
More informationProblems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B
Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2
More informationAnalysis of Variance
Analysis of Variance Math 36b May 7, 2009 Contents 2 ANOVA: Analysis of Variance 16 2.1 Basic ANOVA........................... 16 2.1.1 the model......................... 17 2.1.2 treatment sum of squares.................
More informationStat 704 Data Analysis I Probability Review
1 / 39 Stat 704 Data Analysis I Probability Review Dr. Yen-Yi Ho Department of Statistics, University of South Carolina A.3 Random Variables 2 / 39 def n: A random variable is defined as a function that
More informationPHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1
PHP2510: Principles of Biostatistics & Data Analysis Lecture X: Hypothesis testing PHP 2510 Lec 10: Hypothesis testing 1 In previous lectures we have encountered problems of estimating an unknown population
More informationSTA2601. Tutorial letter 203/2/2017. Applied Statistics II. Semester 2. Department of Statistics STA2601/203/2/2017. Solutions to Assignment 03
STA60/03//07 Tutorial letter 03//07 Applied Statistics II STA60 Semester Department of Statistics Solutions to Assignment 03 Define tomorrow. university of south africa QUESTION (a) (i) The normal quantile
More informationBivariate distributions
Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient
More informationiron retention (log) high Fe2+ medium Fe2+ high Fe3+ medium Fe3+ low Fe2+ low Fe3+ 2 Two-way ANOVA
iron retention (log) 0 1 2 3 high Fe2+ high Fe3+ low Fe2+ low Fe3+ medium Fe2+ medium Fe3+ 2 Two-way ANOVA In the one-way design there is only one factor. What if there are several factors? Often, we are
More informationChapter 11 - Lecture 1 Single Factor ANOVA
April 5, 2013 Chapter 9 : hypothesis testing for one population mean. Chapter 10: hypothesis testing for two population means. What comes next? Chapter 9 : hypothesis testing for one population mean. Chapter
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationChapter 12 - Lecture 2 Inferences about regression coefficient
Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous
More informationChapter 15: Analysis of Variance
Chapter 5: Analysis of Variance 5. Introduction In this chapter, we introduced the analysis of variance technique, which deals with problems whose objective is to compare two or more populations of quantitative
More informationPSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests
PSY 307 Statistics for the Behavioral Sciences Chapter 20 Tests for Ranked Data, Choosing Statistical Tests What To Do with Non-normal Distributions Tranformations (pg 382): The shape of the distribution
More informationReview of Statistics
Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and
More informationi=1 X i/n i=1 (X i X) 2 /(n 1). Find the constant c so that the statistic c(x X n+1 )/S has a t-distribution. If n = 8, determine k such that
Math 47 Homework Assignment 4 Problem 411 Let X 1, X,, X n, X n+1 be a random sample of size n + 1, n > 1, from a distribution that is N(µ, σ ) Let X = n i=1 X i/n and S = n i=1 (X i X) /(n 1) Find the
More informationStatistics, Data Analysis, and Simulation SS 2015
Statistics, Data Analysis, and Simulation SS 2015 08.128.730 Statistik, Datenanalyse und Simulation Dr. Michael O. Distler Mainz, 27. April 2015 Dr. Michael O. Distler
More informationEC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)
1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For
More informationReview. December 4 th, Review
December 4 th, 2017 Att. Final exam: Course evaluation Friday, 12/14/2018, 10:30am 12:30pm Gore Hall 115 Overview Week 2 Week 4 Week 7 Week 10 Week 12 Chapter 6: Statistics and Sampling Distributions Chapter
More informationRegression Models. Chapter 4. Introduction. Introduction. Introduction
Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager
More informationSTAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples.
STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. Rebecca Barter March 16, 2015 The χ 2 distribution The χ 2 distribution We have seen several instances
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More information557: MATHEMATICAL STATISTICS II HYPOTHESIS TESTING: EXAMPLES
557: MATHEMATICAL STATISTICS II HYPOTHESIS TESTING: EXAMPLES Example Suppose that X,..., X n N, ). To test H 0 : 0 H : the most powerful test at level α is based on the statistic λx) f π) X x ) n/ exp
More information1 Statistical inference for a population mean
1 Statistical inference for a population mean 1. Inference for a large sample, known variance Suppose X 1,..., X n represents a large random sample of data from a population with unknown mean µ and known
More informationPurposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions
Part 1: Probability Distributions Purposes of Data Analysis True Distributions or Relationships in the Earths System Probability Distribution Normal Distribution Student-t Distribution Chi Square Distribution
More informationCherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants
18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009
More informationSTAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015
STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis
More informationThe t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary
Patrick Breheny October 13 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction Introduction What s wrong with z-tests? So far we ve (thoroughly!) discussed how to carry out hypothesis
More informationTwo or more categorical predictors. 2.1 Two fixed effects
Two or more categorical predictors Here we extend the ANOVA methods to handle multiple categorical predictors. The statistician has to watch carefully to see whether the effects being considered are properly
More informationStatistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts
Statistical methods for comparing multiple groups Lecture 7: ANOVA Sandy Eckel seckel@jhsph.edu 30 April 2008 Continuous data: comparing multiple means Analysis of variance Binary data: comparing multiple
More informationECON 4160, Autumn term Lecture 1
ECON 4160, Autumn term 2017. Lecture 1 a) Maximum Likelihood based inference. b) The bivariate normal model Ragnar Nymoen University of Oslo 24 August 2017 1 / 54 Principles of inference I Ordinary least
More informationSection 4.6 Simple Linear Regression
Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationCOMPSCI 240: Reasoning Under Uncertainty
COMPSCI 240: Reasoning Under Uncertainty Andrew Lan and Nic Herndon University of Massachusetts at Amherst Spring 2019 Lecture 20: Central limit theorem & The strong law of large numbers Markov and Chebyshev
More informationExercises and Answers to Chapter 1
Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean
More informationHypothesis Testing hypothesis testing approach
Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we
More informationQuestion. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not?
Hypothesis testing Question Very frequently: what is the possible value of μ? Sample: we know only the average! μ average. Random deviation or not? Standard error: the measure of the random deviation.
More informationTable of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).
Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X 1.04) =.8508. For z < 0 subtract the value from
More information2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018
Math 403 - P. & S. III - Dr. McLoughlin - 1 2018 2 Hand-out 2 Dr. M. P. M. M. M c Loughlin Revised 2018 3. Fundamentals 3.1. Preliminaries. Suppose we can produce a random sample of weights of 10 year-olds
More information(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.
FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December
More informationChapter 10: Analysis of variance (ANOVA)
Chapter 10: Analysis of variance (ANOVA) ANOVA (Analysis of variance) is a collection of techniques for dealing with more general experiments than the previous one-sample or two-sample tests. We first
More informationCourse: ESO-209 Home Work: 1 Instructor: Debasis Kundu
Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear
More information2017 Financial Mathematics Orientation - Statistics
2017 Financial Mathematics Orientation - Statistics Written by Long Wang Edited by Joshua Agterberg August 21, 2018 Contents 1 Preliminaries 5 1.1 Samples and Population............................. 5
More informationWe need to define some concepts that are used in experiments.
Chapter 0 Analysis of Variance (a.k.a. Designing and Analysing Experiments) Section 0. Introduction In Chapter we mentioned some different ways in which we could get data: Surveys, Observational Studies,
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationMasters Comprehensive Examination Department of Statistics, University of Florida
Masters Comprehensive Examination Department of Statistics, University of Florida May 6, 003, 8:00 am - :00 noon Instructions: You have four hours to answer questions in this examination You must show
More informationLectures on Simple Linear Regression Stat 431, Summer 2012
Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population
More informationEvaluating Hypotheses
Evaluating Hypotheses IEEE Expert, October 1996 1 Evaluating Hypotheses Sample error, true error Confidence intervals for observed hypothesis error Estimators Binomial distribution, Normal distribution,
More informationF79SM STATISTICAL METHODS
F79SM STATISTICAL METHODS SUMMARY NOTES 9 Hypothesis testing 9.1 Introduction As before we have a random sample x of size n of a population r.v. X with pdf/pf f(x;θ). The distribution we assign to X is
More informationz and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests
z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests Chapters 3.5.1 3.5.2, 3.3.2 Prof. Tesler Math 283 Fall 2018 Prof. Tesler z and t tests for mean Math
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationMATH 728 Homework 3. Oleksandr Pavlenko
MATH 78 Homewor 3 Olesandr Pavleno 4.5.8 Let us say the life of a tire in miles, say X, is normally distributed with mean θ and standard deviation 5000. Past experience indicates that θ = 30000. The manufacturer
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More informationSpace Telescope Science Institute statistics mini-course. October Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses
Space Telescope Science Institute statistics mini-course October 2011 Inference I: Estimation, Confidence Intervals, and Tests of Hypotheses James L Rosenberger Acknowledgements: Donald Richards, William
More informationPCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities
PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets
More informationPractice Problems Section Problems
Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,
More informationDESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya
DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Jurusan Teknik Industri Universitas Brawijaya Outline Introduction The Analysis of Variance Models for the Data Post-ANOVA Comparison of Means Sample
More informationFinding Relationships Among Variables
Finding Relationships Among Variables BUS 230: Business and Economic Research and Communication 1 Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions, hypothesis
More information