5 Introduction to the Theory of Order Statistics and Rank Statistics


This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order and rank statistics. In particular, results will be presented for linear rank statistics. Many nonparametric tests are based on test statistics that are linear rank statistics.

- For one sample: The Wilcoxon signed rank test is based on a linear rank statistic.
- For two samples: The Mann-Whitney-Wilcoxon test, the median test, the Ansari-Bradley test, and the Siegel-Tukey test are based on linear rank statistics.

Most of the information in this section can be found in Randles and Wolfe (1979).

5.1 Order Statistics

Let $X_1, X_2, \ldots, X_n$ be a random sample of continuous random variables having cdf $F(x)$ and pdf $f(x)$. Let $X_{(i)}$ be the $i$th smallest random variable ($i = 1, 2, \ldots, n$). $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ are referred to as the order statistics for $X_1, X_2, \ldots, X_n$. By definition, $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$.

Theorem 5.1: Let $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$ be the order statistics for a random sample from a distribution with cdf $F(x)$ and pdf $f(x)$. The joint density for the order statistics is

$$g(x_{(1)}, x_{(2)}, \ldots, x_{(n)}) = \begin{cases} n! \prod_{i=1}^{n} f(x_{(i)}) & \text{for } -\infty < x_{(1)} < x_{(2)} < \cdots < x_{(n)} < \infty \\ 0 & \text{otherwise} \end{cases} \tag{1}$$

Theorem 5.2: The marginal density for the $j$th order statistic $X_{(j)}$ ($j = 1, 2, \ldots, n$) is

$$g_j(t) = \frac{n!}{(j-1)!\,(n-j)!}\, [F(t)]^{j-1}\, [1 - F(t)]^{n-j}\, f(t), \qquad -\infty < t < \infty.$$

For a random variable $X$ with cdf $F(x)$, the inverse distribution $F^{-1}(\cdot)$ is defined as

$$F^{-1}(y) = \inf\{x : F(x) \ge y\}, \qquad 0 < y < 1.$$

If $F(x)$ is strictly increasing between 0 and 1, then there is only one $x$ such that $F(x) = y$. In this case, $F^{-1}(y) = x$.

Theorem 5.3 (Probability Integral Transformation): Let $X$ be a continuous random variable with distribution function $F(x)$. The random variable $Y = F(X)$ is uniformly distributed on $(0, 1)$.

Let $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$ be the order statistics for a random sample from a continuous distribution. Application of Theorem 5.3 implies that $F(X_{(1)}) < F(X_{(2)}) < \cdots < F(X_{(n)})$ are distributed as the order statistics from a uniform distribution on $(0, 1)$.
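The marginal density in Theorem 5.2 can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `order_stat_density` and its arguments `Fcdf` and `fpdf` are hypothetical, and the Exp(1) distribution with $n = 5$, $j = 3$ is chosen only for illustration. It confirms that $g_j$ integrates to 1 and that the mean implied by $g_j$ agrees with a Monte Carlo estimate.

```r
# Marginal density of the j-th order statistic (Theorem 5.2).
# `Fcdf` and `fpdf` are hypothetical argument names for the cdf and pdf.
order_stat_density <- function(t, j, n, Fcdf, fpdf) {
  const <- factorial(n) / (factorial(j - 1) * factorial(n - j))
  const * Fcdf(t)^(j - 1) * (1 - Fcdf(t))^(n - j) * fpdf(t)
}

# Illustration: 3rd smallest of n = 5 observations from Exp(rate = 1).
j <- 3
n <- 5
g <- function(t) order_stat_density(t, j, n, pexp, dexp)

# A density must integrate to 1 over its support.
total_mass <- integrate(g, 0, Inf)$value

# Mean of X_(3) implied by the density.
theory_mean <- integrate(function(t) t * g(t), 0, Inf)$value

# Monte Carlo estimate of the same mean: sort each sample, keep the j-th value.
set.seed(1)
sim_mean <- mean(replicate(20000, sort(rexp(n))[j]))

c(total_mass = total_mass, theory_mean = theory_mean, sim_mean = sim_mean)
```

For the exponential case the exact mean is also known in closed form, $1/5 + 1/4 + 1/3$, which gives a third point of comparison.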

Let $V_j = F(X_{(j)})$ for $j = 1, 2, \ldots, n$. Then, by Theorem 5.2, the marginal density for each $V_j$ has the form

$$g_j(t) = \frac{n!}{(j-1)!\,(n-j)!}\, t^{j-1} (1-t)^{n-j}, \qquad 0 < t < 1,$$

because $F(t) = t$ and $f(t) = 1$ for a uniform distribution on $(0, 1)$. Thus, $V_j$ has a beta distribution with parameters $\alpha = j$ and $\beta = n - j + 1$. Therefore, the moments of $V_j$ are

$$E(V_j^r) = \frac{n!\,\Gamma(r + j)}{(j-1)!\,\Gamma(n + r + 1)}$$

where $\Gamma(k) = (k-1)!$ for a positive integer $k$. Thus, when $V_j$ is the $j$th order statistic from a uniform distribution,

$$E(V_j) = \frac{j}{n+1}, \qquad Var(V_j) = \frac{j(n - j + 1)}{(n+1)^2 (n+2)}.$$

Simulation to Demonstrate Theorem 5.3 (Probability Integral Transformation)

Case 1: N(0,1) Distribution

1. Generate a random sample $(x_1, x_2, \ldots, x_{5000})$ of 5000 values from a normal N(0,1) distribution.
2. Determine the 5000 cdf values $F(x_i)$, where $F$ is the N(0,1) cdf.
3. Plot the histogram and empirical cdf of the original N(0,1) sample. Note how they represent a sample from a standard normal distribution.
4. Plot the histogram and empirical cdf of the $F(x_i)$ values. Note that the histogram and empirical cdf of the $F(x_i)$ values represent a sample from a uniform U(0,1) distribution (as supported by Theorem 5.3).

Case 2: Exp(4) Distribution

1. Generate a random sample $(x_1, x_2, \ldots, x_{5000})$ of 5000 values from an exponential Exp(4) distribution.
2. Determine the 5000 cdf values $F(x_i)$, where $F$ is the Exp(4) cdf.
3. Plot the histogram and empirical cdf of the original Exp(4) sample. Note how they represent a sample from an exponential Exp(4) distribution.
4. Plot the histogram and empirical cdf of the $F(x_i)$ values. Note that the histogram and empirical cdf of the $F(x_i)$ values represent a sample from a uniform U(0,1) distribution (as supported by Theorem 5.3).
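The moment formulas for $V_j$ stated earlier in this section can also be checked by simulation. The sketch below is an illustration only: the choices $n = 10$, $j = 3$, the Exp(4) parent distribution, and the number of replications are arbitrary assumptions, and any continuous parent distribution should give the same answers.

```r
set.seed(2)
n <- 10      # sample size (arbitrary choice)
j <- 3       # which order statistic to track (arbitrary choice)
reps <- 20000

# V_j = F(X_(j)): draw an Exp(4) sample, sort it, apply the true cdf,
# and keep the j-th transformed value.
v <- replicate(reps, pexp(sort(rexp(n, rate = 4)), rate = 4)[j])

sim_mean <- mean(v)
sim_var  <- var(v)

# Closed-form mean and variance of the j-th uniform order statistic.
theory_mean <- j / (n + 1)
theory_var  <- j * (n - j + 1) / ((n + 1)^2 * (n + 2))

# General r-th moment formula, evaluated at r = 2.
r <- 2
theory_m2 <- factorial(n) * gamma(r + j) / (factorial(j - 1) * gamma(n + r + 1))

rbind(simulated = c(mean = sim_mean, var = sim_var),
      theory    = c(mean = theory_mean, var = theory_var))
```

As an internal consistency check, `theory_m2` should equal `theory_var + theory_mean^2`.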

[Figure: two sets of four panels. For the N(0,1) sample: histogram of the sample (x-axis x, y-axis Frequency), histogram of the cdf values (x-axis Fx, y-axis Frequency), ECDF of the sample (y-axis Fn(x)), and ECDF of the cdf values (y-axis Fn(x)). The same four panels are shown for the Exp(4) sample.]

R Code for Simulation of Theorem 5.3 (Probability Integral Transformation)

    n <- 5000   # size of random sample

    # CASE 1: Random sample from the N(0,1) distribution
    x <- rnorm(n, 0, 1)
    x[1:10]                 # view first 10 values
    Fx <- pnorm(x)          # N(0,1) cdf evaluated at each sample value
    Fx[1:10]
    dev.new()               # new graphics window (portable; the original used windows())
    par(mfrow = c(2, 2))    # 2 x 2 grid of plots
    hist(x, main = "Histogram of N(0,1) Sample")
    hist(Fx, main = "Histogram of CDF of N(0,1) Sample")
    plot(ecdf(x), main = "ECDF of N(0,1) Sample")
    plot(ecdf(Fx), main = "ECDF of CDF of N(0,1) Sample")

    # CASE 2: Random sample from the Exponential(4) distribution
    x <- rexp(n, 4)
    x[1:10]                 # view first 10 values
    Fx <- pexp(x, 4)        # Exp(4) cdf evaluated at each sample value
    Fx[1:10]
    dev.new()
    par(mfrow = c(2, 2))
    hist(x, main = "Histogram of Exp(4) Sample")
    hist(Fx, main = "Histogram of CDF of Exp(4) Sample")
    plot(ecdf(x), main = "ECDF of Exp(4) Sample")
    plot(ecdf(Fx), main = "ECDF of CDF of Exp(4) Sample")

5.2 Equal-in-Distribution Results

Two random variables $S$ and $T$ are equal in distribution if $S$ and $T$ have the same cdf. To denote equal in distribution, we write $S \stackrel{d}{=} T$.

Theorem 5.4: A random variable $X$ has a distribution that is symmetric about some number $\mu$ if and only if $(X - \mu) \stackrel{d}{=} (\mu - X)$.

Theorem 5.5: Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed (i.i.d.) random variables. Let $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ denote any permutation of the integers $(1, 2, \ldots, n)$. Then $(X_1, X_2, \ldots, X_n) \stackrel{d}{=} (X_{\alpha_1}, X_{\alpha_2}, \ldots, X_{\alpha_n})$.

A set of random variables $X_1, X_2, \ldots, X_n$ is exchangeable if for every permutation $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ of the integers $1, 2, \ldots, n$, $(X_1, X_2, \ldots, X_n) \stackrel{d}{=} (X_{\alpha_1}, X_{\alpha_2}, \ldots, X_{\alpha_n})$.

If $X_1, X_2, \ldots, X_n$ are i.i.d. random variables, then the set $X_1, X_2, \ldots, X_n$ is exchangeable.

The statistic $t(\cdot)$ is

1. a translation statistic if $t(x_1 + k, x_2 + k, \ldots, x_n + k) = t(x_1, x_2, \ldots, x_n) + k$ for every $k$ and $x_1, x_2, \ldots, x_n$;
2. a translation-invariant statistic if $t(x_1 + k, x_2 + k, \ldots, x_n + k) = t(x_1, x_2, \ldots, x_n)$ for every $k$ and $x_1, x_2, \ldots, x_n$.
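A quick numerical illustration of the two definitions may help. In the sketch below (the data and the shift `k` are arbitrary, hypothetical values), the sample mean and median behave as translation statistics, while the sample standard deviation and the range width behave as translation-invariant statistics.

```r
set.seed(3)
x <- rnorm(8)   # arbitrary illustrative data
k <- 2.5        # arbitrary shift

# Translation statistics: shifting the data by k shifts the statistic by k.
mean_shift   <- mean(x + k) - mean(x)
median_shift <- median(x + k) - median(x)

# Translation-invariant statistics: shifting the data leaves them unchanged.
sd_change    <- sd(x + k) - sd(x)
range_change <- diff(range(x + k)) - diff(range(x))

c(mean_shift = mean_shift, median_shift = median_shift,
  sd_change = sd_change, range_change = range_change)
```

Up to floating-point rounding, the first two entries equal `k` and the last two equal 0.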

5.3 Ranking Statistics

Let $Z_1, Z_2, \ldots, Z_n$ be a random sample from a continuous distribution with cdf $F(z)$, and let $Z_{(1)} < Z_{(2)} < \cdots < Z_{(n)}$ be the corresponding order statistics.

$Z_i$ has rank $R_i$ among $Z_1, Z_2, \ldots, Z_n$ if $Z_i = Z_{(R_i)}$, assuming the $R_i$th order statistic is uniquely defined.

By "uniquely defined" we are assuming that ties are not possible. That is, $Z_{(i)} \ne Z_{(j)}$ for all $i \ne j$.

Let $\mathcal{R} = \{r : r \text{ is a permutation of the integers } (1, 2, \ldots, n)\}$. That is, $\mathcal{R}$ is the set of all permutations of the integers $(1, 2, \ldots, n)$.

Theorem 5.6: Let $R = (R_1, R_2, \ldots, R_n)$ be the vector of ranks where $R_i$ is the rank of $Z_i$ among $Z_1, Z_2, \ldots, Z_n$. Then $R$ is uniformly distributed over $\mathcal{R}$. That is, $P(R = r) = 1/n!$ for each permutation $r$.

Theorem 5.7: Let $Z_1, Z_2, \ldots, Z_n$ be a random sample from a continuous distribution, and let $R$ be the corresponding vector of ranks where $R_i$ is the rank of $Z_i$ for $i = 1, 2, \ldots, n$. Then

$$P[R_i = r] = \begin{cases} 1/n & \text{for } r = 1, 2, \ldots, n \\ 0 & \text{otherwise} \end{cases}$$

and, for $i \ne j$,

$$P[R_i = r, R_j = s] = \begin{cases} \dfrac{1}{n(n-1)} & \text{for } r \ne s,\ r, s = 1, 2, \ldots, n \\ 0 & \text{otherwise.} \end{cases}$$

Corollary 5.8: Let $R$ be the vector of ranks corresponding to a random sample from a continuous distribution. Then

$$E[R_i] = \frac{n+1}{2} \quad \text{and} \quad Var[R_i] = \frac{(n+1)(n-1)}{12} \quad \text{for } i = 1, 2, \ldots, n,$$

$$Cov[R_i, R_j] = -\frac{n+1}{12} \quad \text{for } i \ne j.$$

Let $V_1, V_2, \ldots, V_n$ be random variables with joint distribution function $D$, where $D$ is a member of some collection $\mathcal{A}$ of possible joint distributions. Let $T(V_1, V_2, \ldots, V_n)$ be a statistic based on $V_1, V_2, \ldots, V_n$. The statistic $T$ is distribution-free over $\mathcal{A}$ if the distribution of $T$ is the same for every joint distribution in $\mathcal{A}$.

Corollary 5.9: Let $Z_1, Z_2, \ldots, Z_n$ be a random sample from a continuous distribution, and let $R$ be the corresponding vector of ranks. If $V(R)$ is a statistic based only on $R$, then $V(R)$ is distribution-free over the class $\mathcal{A}$ of joint distributions of $n$ i.i.d. continuous random variables.

A statistic (such as $V(R)$) that is a function of $Z_1, Z_2, \ldots, Z_n$ only through the rank vector $R$ is called a rank statistic.
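Because Theorem 5.6 makes every permutation equally likely, the results in Theorem 5.7 and Corollary 5.8 can be verified by brute-force enumeration for small $n$. The following sketch uses $n = 4$; the helper `all_perms` is a hypothetical utility written for this illustration, not a base R function.

```r
# Hypothetical helper: all permutations of the vector v, one per row.
all_perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (i in seq_along(v)) {
    out <- rbind(out, cbind(v[i], all_perms(v[-i])))
  }
  out
}

n <- 4
perms <- all_perms(1:n)   # n! = 24 equally likely rank vectors

# Marginal distribution of R_1: each value should have probability 1/n.
p_R1 <- as.vector(table(perms[, 1])) / nrow(perms)

# Joint distribution of (R_1, R_2): 1/(n(n-1)) off the diagonal, 0 on it.
p_R12 <- table(perms[, 1], perms[, 2]) / nrow(perms)

# Exact moments (divide by n!, since this is the full distribution).
E_R1    <- mean(perms[, 1])
Var_R1  <- mean((perms[, 1] - E_R1)^2)
Cov_R12 <- mean((perms[, 1] - E_R1) * (perms[, 2] - mean(perms[, 2])))

c(E = E_R1, Var = Var_R1, Cov = Cov_R12)
```

For $n = 4$ the corollary gives $E[R_i] = 2.5$, $Var[R_i] = 15/12$, and $Cov[R_i, R_j] = -5/12$, which is what the enumeration returns.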

Example of a distribution-free statistic:

Let $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_m$ be independent random samples from continuous distributions with cdfs $F(x)$ and $G(x) = F(x - \Delta)$, respectively ($-\infty < \Delta < \infty$). That is, $\Delta$ is a shift parameter.

Combine the $X$ and $Y$ samples. Let $R_i$ ($i = 1, 2, \ldots, n$) and $Q_j$ ($j = 1, 2, \ldots, m$) be the ranks of the $n$ $X$-values and the $m$ $Y$-values in the combined sample. Thus, $R_i$ and $Q_j$ take on values $1, 2, \ldots, (m + n)$.

Thus, the rank vector $R = (R_1, R_2, \ldots, R_n, Q_1, Q_2, \ldots, Q_m)$ is simply a permutation of the integers $(1, 2, \ldots, (m + n))$ which satisfies the constraint

$$\sum_{i=1}^{n} R_i + \sum_{j=1}^{m} Q_j = \sum_{k=1}^{m+n} k = \frac{(m + n)(m + n + 1)}{2}.$$

To construct a test for $H_0: \Delta = 0$ vs $H_1: \Delta > 0$ based on the ranks in rank vector $R$, we compare the $X$-ranks $(R_1, R_2, \ldots, R_n)$ to the $Y$-ranks $(Q_1, Q_2, \ldots, Q_m)$.

If we know the $X$-ranks $(R_1, R_2, \ldots, R_n)$, then we also know the $Y$-ranks. Thus, it will be sufficient to consider a statistic based only on the $X$-ranks, say $W(R_1, R_2, \ldots, R_n)$.

The test statistic proposed by Wilcoxon is $W = \sum_{i=1}^{n} R_i$. That is, $W$ is the sum of the $X$-ranks. $W$ is known as a ranksum statistic.

Note that the statistic $W$ is a function of the data only through the rank vector $R = (R_1, R_2, \ldots, R_n, Q_1, Q_2, \ldots, Q_m)$. That is, once we have $R$, we no longer need $(X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m)$ to calculate $W$.

If $H_0: \Delta = 0$ is true, then the data $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m$ are i.i.d. continuous random variables. Applying Corollary 5.9, the rank statistic $W$ is distribution-free over the class $\mathcal{A}$ of all continuous distributions. That is, for any continuous cdf $F \in \mathcal{A}$, the distribution of $W$ does not depend on the choice of $F$.

Theorem 5.10: Let $W$ be the rank sum statistic when $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_m$ are independent random samples from $F(x)$ and $G(y) = F(y - \Delta)$, respectively. If $H_0: \Delta = 0$ is true, then the discrete distribution of $W$ is given by

$$P_0[W = w] = \begin{cases} \dfrac{t_{m,n}(w)}{\binom{m+n}{n}} & \text{for } w = \dfrac{n(n+1)}{2},\ \dfrac{n(n+1)}{2} + 1,\ \ldots,\ \dfrac{n(2m + n + 1)}{2} \\ 0 & \text{otherwise} \end{cases}$$

where $t_{m,n}(w)$ is the number of subsets of $n$ integers selected without replacement from $(1, 2, \ldots, (m + n))$ such that their sum equals $w$.

Thus, to calculate $P_0[W = w]$ for a given $m$ and $n$, we need to (i) generate all $\binom{m+n}{n}$ possible assignments of $(m + n)$ ranks to the $X$ and $Y$ observations, (ii) calculate $W$ for each assignment, and (iii) count the number of cases where $W = w$.

For example, consider the case with $n = 2$ and $m = 4$. There are $\binom{6}{2} = 15$ possible assignments. Thus, there will be two $X$-ranks $(R_1, R_2)$ from the six possible ranks $(1, 2, 3, 4, 5, 6)$. $W = R_1 + R_2$ is then calculated for all possible assignments of the 6 ranks.
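Steps (i) to (iii) can be carried out directly in R with `combn()`. The sketch below is one way to do it for the $n = 2$, $m = 4$ case; the variable names are illustrative assumptions, and the upper-tail probability at 9 is included only as an example of how the null distribution would be used.

```r
n <- 2   # number of X observations
m <- 4   # number of Y observations

# (i) All choose(m + n, n) ways to pick which ranks belong to the X sample.
#     Each column of x_ranks is one possible set of X-ranks.
x_ranks <- combn(m + n, n)

# (ii) The rank sum W for each assignment.
W <- colSums(x_ranks)

# (iii) Count assignments per value of W, giving t_{m,n}(w), then normalize.
counts <- table(W)
pmf <- counts / choose(m + n, n)
pmf

# Example upper-tail probability under H0: P0[W >= 9].
p_value <- mean(W >= 9)   # 4/15
```

For larger samples full enumeration grows quickly, and base R's `pwilcox()` and `dwilcox()` (which work with the shifted statistic $W - n(n+1)/2$) are the practical alternative.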

The following table shows the 15 assignments of the 6 ranks and the corresponding $W$ statistic values.

    X-ranks (R_1, R_2)   Y-ranks (Q_1, Q_2, Q_3, Q_4)   W = R_1 + R_2
    5, 6                 1, 2, 3, 4                     11
    4, 6                 1, 2, 3, 5                     10
    4, 5                 1, 2, 3, 6                      9
    3, 6                 1, 2, 4, 5                      9
    3, 5                 1, 2, 4, 6                      8
    3, 4                 1, 2, 5, 6                      7
    2, 6                 1, 3, 4, 5                      8
    2, 5                 1, 3, 4, 6                      7
    2, 4                 1, 3, 5, 6                      6
    2, 3                 1, 4, 5, 6                      5
    1, 6                 2, 3, 4, 5                      7
    1, 5                 2, 3, 4, 6                      6
    1, 4                 2, 3, 5, 6                      5
    1, 3                 2, 4, 5, 6                      4
    1, 2                 3, 4, 5, 6                      3

For each of the 15 unordered assignments of ranks within samples, there are $4!\,2! = 48$ ordered assignments yielding the same $W$ value. Thus, overall there are $6! = 720 = (15)(48)$ ordered assignments of the 6 ranks.

The distribution of $W$ is

    w            3     4     5     6     7     8     9     10    11
    P_0[W = w]   1/15  1/15  2/15  2/15  3/15  2/15  2/15  1/15  1/15

Suppose that $W = 9$. Then for the test of $H_0: \Delta = 0$ vs $H_1: \Delta > 0$:

p-value = the probability of getting a test statistic $W$ that is at least 9 $= 2/15 + 1/15 + 1/15 = 4/15 \approx .27$.

Note that $w \in \{3, 4, \ldots, 11\} = \left\{ \frac{n(n+1)}{2},\ \frac{n(n+1)}{2} + 1,\ \ldots,\ \frac{n(2m + n + 1)}{2} \right\}$ as stated in Theorem 5.10.

Theorem 5.11: Let $W = \sum_{j=1}^{n} R_j$ be the ranksum statistic. If $H_0: \Delta = 0$ is true (i.e., $F = G$), then the distribution of $W$ is symmetric about the value $\mu = n(m + n + 1)/2$ and

$$E_0[W] = \mu, \qquad Var[W] = \frac{mn(m + n + 1)}{12}.$$

Statistics Based on Counting and Ranking

Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous distribution that is symmetric about the value $\mu$. Let $(Z_1, Z_2, \ldots, Z_n) = (X_1 - \mu, X_2 - \mu, \ldots, X_n - \mu)$. Then $Z_1, Z_2, \ldots, Z_n$ is a random sample from a distribution that is symmetric about 0.

Define $\Psi_i = \Psi(Z_i)$ to be an indicator variable where

$$\Psi(t) = \begin{cases} 1 & \text{if } t > 0 \\ 0 & \text{if } t \le 0. \end{cases}$$

Lemma 5.12: Let $Z$ be a random variable that is symmetrically distributed about 0. Then the random variables $|Z|$ and $\Psi = \Psi(Z)$ are stochastically independent. That is,

$$P(\Psi = 1, |Z| \le t) = P(\Psi = 1)\,P(|Z| \le t) \quad \text{and} \quad P(\Psi = 0, |Z| \le t) = P(\Psi = 0)\,P(|Z| \le t).$$

For random variables $Z_1, Z_2, \ldots, Z_n$, the absolute rank of $Z_i$, denoted $R_i^+$, is the rank of $|Z_i|$ among $|Z_1|, |Z_2|, \ldots, |Z_n|$.

The signed rank of $Z_i$ is $\Psi_i R_i^+$. Thus, (i) $\Psi_i R_i^+ = R_i^+$ if $Z_i > 0$ and (ii) $\Psi_i R_i^+ = 0$ if $Z_i \le 0$.

A signed rank statistic is a statistic that is a function of $\Psi_1 R_1^+, \Psi_2 R_2^+, \ldots, \Psi_n R_n^+$.

The following theorem establishes properties of the joint distribution of $\Psi = (\Psi_1, \Psi_2, \ldots, \Psi_n)$ and $R^+ = (R_1^+, R_2^+, \ldots, R_n^+)$.

Theorem 5.13: Let $Z_1, Z_2, \ldots, Z_n$ be a random sample from a continuous distribution that is symmetric about 0. Then $\Psi_1, \Psi_2, \ldots, \Psi_n, R^+$ are mutually independent. Moreover, each $\Psi_i$ is a Bernoulli random variable with $p = 1/2$, and $R^+$ is uniformly distributed over $\mathcal{R}$ (the set of all permutations of the integers $(1, 2, \ldots, n)$).

Proof of Theorem 5.13:

- $Z_1, Z_2, \ldots, Z_n$ are independent because they are a random sample. Lemma 5.12 implies that $\Psi_1, |Z_1|, \Psi_2, |Z_2|, \ldots, \Psi_n, |Z_n|$ are $2n$ mutually independent random variables.
- Each $\Psi_i$ is a Bernoulli random variable with parameter $p = P[Z_i > 0] = 1/2$ because $Z_i$ is continuous and symmetrically distributed about 0.
- $R^+$ is independent of $\Psi_1, \Psi_2, \ldots, \Psi_n$ because it is a function only of $|Z_1|, |Z_2|, \ldots, |Z_n|$. That is, $R^+$ does not depend on any $\Psi_i$.
- Because $R^+$ is the rank vector of the $n$ i.i.d. continuous random variables $|Z_1|, |Z_2|, \ldots, |Z_n|$, application of Theorem 5.6 shows that $R^+$ is uniformly distributed over $\mathcal{R}$ (the set of permutations of the integers $(1, 2, \ldots, n)$).

Let $\mathcal{A}_0$ be the set of joint distributions of $n$ i.i.d. continuous random variables that are symmetrically distributed about 0.

Corollary 5.14: Let $S(\Psi, R^+)$ be a statistic that depends on $Z_1, Z_2, \ldots, Z_n$ only through $\Psi = (\Psi_1, \Psi_2, \ldots, \Psi_n)$ and $R^+ = (R_1^+, R_2^+, \ldots, R_n^+)$. Then the statistic $S(\cdot)$ is distribution-free over $\mathcal{A}_0$.
Proof of Corollary 5.14: This result follows from Theorem 5.13 because $\Psi$ and $R^+$ have the same joint distribution for every joint distribution $F_0(Z_1, Z_2, \ldots, Z_n) \in \mathcal{A}_0$. That is, the joint distribution of $\Psi$ and $R^+$ does not depend on the choice of $F_0(Z_1, Z_2, \ldots, Z_n) \in \mathcal{A}_0$.

We will often be interested in functions of $\Psi$ and $R^+$ that are symmetric functions of the signed ranks $\Psi_1 R_1^+, \Psi_2 R_2^+, \ldots, \Psi_n R_n^+$. If this is the case, then Theorem 5.15 can help establish the distribution of such a statistic.
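Theorem 5.13 and Corollary 5.14 can be illustrated by simulation. The sketch below is an illustration under arbitrary assumptions ($n = 5$, 20000 replications, and the N(0,1) and Cauchy distributions as two members of $\mathcal{A}_0$); the function name `sim_sign_rank` is hypothetical. It estimates the joint distribution of $(\Psi_1, R_1^+)$, where independence together with the stated marginals implies that every cell has probability $\frac{1}{2} \cdot \frac{1}{n}$ regardless of the parent distribution.

```r
set.seed(4)
n <- 5
reps <- 20000

# Estimate the joint distribution of (Psi_1, R_1^+) for samples drawn by
# `rdist`, a random generator for a distribution symmetric about 0.
sim_sign_rank <- function(rdist) {
  psi1   <- numeric(reps)
  rplus1 <- numeric(reps)
  for (b in seq_len(reps)) {
    z <- rdist(n)
    psi1[b]   <- as.numeric(z[1] > 0)   # sign indicator of Z_1
    rplus1[b] <- rank(abs(z))[1]        # absolute rank of Z_1
  }
  table(psi1, rplus1) / reps
}

joint_normal <- sim_sign_rank(rnorm)     # light-tailed symmetric case
joint_cauchy <- sim_sign_rank(rcauchy)   # heavy-tailed symmetric case

# Every cell should be near 1 / (2 * n) = 0.1 in both tables.
round(joint_normal, 3)
round(joint_cauchy, 3)
```

The two tables agree up to simulation error even though the parent distributions have very different tails, which is the distribution-free property in action.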

Theorem 5.15: Let $Z_1, Z_2, \ldots, Z_n$ be a random sample from a continuous distribution that is symmetric about 0. Let $Q$ be the number of positive $Z$s. For $Q = q$, let $S_1 < S_2 < \cdots < S_q$ denote the ordered absolute ranks of those $Z$s that are positive (i.e., $S_1 < S_2 < \cdots < S_q$ are the positive signed ranks in numerical order). Then

$$P[Q = q, S_1 = s_1, S_2 = s_2, \ldots, S_q = s_q] = \begin{cases} (1/2)^n & \text{for } q = 0, 1, \ldots, n \text{ and each } q\text{-tuple } (s_1, s_2, \ldots, s_q) \text{ such that each } s_i \text{ is an integer and } 1 \le s_1 < s_2 < \cdots < s_q \le n \\ 0 & \text{otherwise.} \end{cases}$$

Recall: Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a continuous distribution that is symmetric about $\mu$. Then $(Z_1, Z_2, \ldots, Z_n) = (X_1 - \mu, X_2 - \mu, \ldots, X_n - \mu)$ is a random sample from a distribution that is symmetric about 0.

Thus, all of the preceding results also apply to the $(X_i - \mu)$ random variables. That is, we can generalize the results to $\mathcal{A}_\mu$ = the class of continuous distributions that are symmetric about $\mu$ for any $-\infty < \mu < \infty$.

Example: Suppose we have a random sample $X_1, X_2, \ldots, X_n$ from a distribution in $\mathcal{A}_\mu$. The Wilcoxon signed rank statistic $W^+$ is defined as

$$W^+ = \sum_{i=1}^{n} \Psi_i R_i^+.$$

That is, $W^+$ is the sum of the signed ranks.

To test $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$, we would reject $H_0$ if $W^+$ is too large. That is, we would reject $H_0$ if the p-value is small (e.g., p-value < .05). So how do we calculate the p-value?

Corollary 5.16: Let $W^+$ be the Wilcoxon signed rank statistic for testing $H_0: \mu = \mu_0$. For a random sample of size $n$, the distribution of $W^+$ assuming $H_0$ is true is

$$P_0[W^+ = k] = \begin{cases} \dfrac{c_n(k)}{2^n} & \text{for } k = 0, 1, \ldots, \dfrac{n(n+1)}{2} \\ 0 & \text{otherwise} \end{cases}$$

where $c_n(k)$ = the number of subsets of the integers $\{1, 2, \ldots, n\}$ whose elements sum to $k$ (that is, for which $W^+$ is equal to $k$).

Suppose $n = 4$. The following table lists the $2^4 = 16$ combinations of signed ranks and the corresponding $W^+$ values.

    Subset of {1, 2, 3, 4}   W+      Subset of {1, 2, 3, 4}   W+
    (empty set)               0      {2,3}                     5
    {1}                       1      {2,4}                     6
    {2}                       2      {3,4}                     7
    {3}                       3      {1,2,3}                   6
    {4}                       4      {1,2,4}                   7
    {1,2}                     3      {1,3,4}                   8
    {1,3}                     4      {2,3,4}                   9
    {1,4}                     5      {1,2,3,4}                10
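The subset counts $c_n(k)$ can be generated by enumerating all $2^n$ sign patterns. The following sketch does this for $n = 4$ and compares the result with base R's `dsignrank()`; the variable names are illustrative assumptions.

```r
n <- 4

# Each row is one possible sign vector (Psi_1, ..., Psi_n); by Theorem 5.15
# all 2^n rows are equally likely under H0. Column i corresponds to the
# observation whose absolute rank is i.
signs <- as.matrix(expand.grid(rep(list(0:1), n)))

# W+ for each sign pattern: the sum of the ranks that carry a positive sign.
w_plus <- as.vector(signs %*% (1:n))

c_n <- table(w_plus)   # c_n(k): number of subsets of {1, ..., n} summing to k
pmf <- c_n / 2^n       # null distribution of W+
pmf

# Example upper-tail probability under H0: P0[W+ >= 8].
upper_tail_8 <- mean(w_plus >= 8)   # 3/16

# Cross-check against the built-in signed rank distribution.
max(abs(as.vector(pmf) - dsignrank(0:(n * (n + 1) / 2), n)))
```

For larger $n$ the enumeration doubles with every added observation, so `dsignrank()` and `psignrank()` are the practical tools; the enumeration is shown here only because it mirrors the table.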

Thus, the distribution of $W^+$ is

    k             0     1     2     3     4     5     6     7     8     9     10
    P[W+ = k]     1/16  1/16  1/16  2/16  2/16  2/16  2/16  2/16  1/16  1/16  1/16

Suppose the data are $(X_1, X_2, X_3, X_4)$ = (4.6, 5.?, 5.6, 5.7), and we want to test $H_0: \mu = 5$ vs $H_1: \mu > 5$.

Next calculate the deviations from $\mu_0 = 5$. That is, $(Z_1, Z_2, Z_3, Z_4)$ = (−.4, .?, .6, .7), and the vector of absolute values is $(|Z_1|, |Z_2|, |Z_3|, |Z_4|)$ = (.4, .?, .6, .7).

The absolute rank vector is $R^+ = (R_1^+, R_2^+, R_3^+, R_4^+) = (2, 1, 3, 4)$.

$\Psi_i = 1$ if $Z_i > 0$ (or equivalently, if $X_i > 5$), and is 0 otherwise. Thus, $(\Psi_1, \Psi_2, \Psi_3, \Psi_4) = (0, 1, 1, 1)$.

Therefore the signed rank statistic $W^+ = \sum_{i=1}^{4} \Psi_i R_i^+$ is $W^+ = (0)(2) + (1)(1) + (1)(3) + (1)(4) = 8$.

The p-value is the probability of getting a $W^+$ value that is at least 8. Therefore, the p-value $= P[W^+ = 8, 9, \text{ or } 10] = (1 + 1 + 1)/16 = 3/16 = .1875$.

Theorem 5.17: The distribution of the Wilcoxon signed rank statistic $W^+$ is symmetric about its mean $\mu_{W^+} = n(n+1)/4$ if $H_0: \mu = \mu_0$ is true.

5.4 Linear Rank Statistics

Earlier we studied the ranksum statistic $W = \sum_{i=1}^{n} R_i$ where $R_i$ is the rank of $X_i$ among a combined sample $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m$.

If $H_0: \Delta = 0$ is true, then the random variables $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m$ are i.i.d., and by Corollary 5.9, $W$ is distribution-free over the class of continuous distributions $\mathcal{A}$.

The test statistic $W$ has two important properties:

1. $W$ maintains the desired $\alpha$-level over a very broad class of distributions ($\mathcal{A}$).
2. The power of $W$ is excellent for detecting a shift for many distributions, especially for a medium-tailed distribution (such as the normal or logistic).

We now consider a general class of rank statistics (which includes $W$).

Let $R = (R_1, R_2, \ldots, R_N)$ be a vector of ranks. Let $a(1), a(2), \ldots, a(N)$ and $c(1), c(2), \ldots, c(N)$ be two sets of $N$ constants. A statistic of the form

$$S = \sum_{i=1}^{N} c(i)\, a(R_i)$$

is called a linear rank statistic. The constants $a(1), a(2), \ldots, a(N)$ are called the scores, and $c(1), c(2), \ldots, c(N)$ are called the regression constants.

The choice of $c(1), c(2), \ldots, c(N)$ will depend on the specific testing problem of interest.

Case I: In two-sample problems, $R$ is the rank vector of $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m$. In general, let $R_1, R_2, \ldots, R_n$ be the ranks of $X_1, X_2, \ldots, X_n$ and $R_{n+1}, R_{n+2}, \ldots, R_{m+n}$ be the ranks of $Y_1, Y_2, \ldots, Y_m$. If

$$c(i) = \begin{cases} 1 & \text{for } i = 1, 2, \ldots, n \\ 0 & \text{for } i = n+1, n+2, \ldots, m+n \end{cases} \tag{7}$$

then

$$S = \sum_{i=1}^{m+n} c(i)\, a(R_i) = \sum_{i=1}^{n} a(R_i),$$

which is the sum of the scores associated with the ranks of $X_1, X_2, \ldots, X_n$.

The constants $c(i)$ in (7) are called two-sample regression constants.

Case II: For Case I, if we also let $a(i) = i$ for $i = 1, 2, \ldots, m + n$, then $S = \sum_{i=1}^{n} R_i$, which is the ranksum statistic $W$. The scores $a(i) = i$ are called the Wilcoxon scores.

Case III: It is clear that a different choice of $a(1), a(2), \ldots, a(N)$ scores for the two-sample problem will yield a test statistic with different properties. Let $M$ = the median of the combined sample $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_m$, and define

$$a(i) = \begin{cases} 0 & \text{if } i \le \dfrac{m + n + 1}{2} \\ 1 & \text{if } i > \dfrac{m + n + 1}{2} \end{cases} \tag{8}$$

Consider $S$ with these $a(i)$ scores and the two-sample regression constants in Case I:

$$S = \sum_{i=1}^{n} a(R_i) = \text{the number of } X_i \text{ values larger than the sample median } M.$$

This $S$ is the linear rank statistic for the two-sample median test, and the scores in (8) are called the median scores.

5.4.1 Linear Rank Statistics under $H_0$

In this section, general properties of linear rank statistics will be studied under the null hypothesis, where "null hypothesis" refers to any set of assumptions that will result in the rank vector $R$ being uniformly distributed over $\mathcal{R}$ (the set of permutations of the integers $1, 2, \ldots, N$). In future sections, we will study the null hypothesis for specific testing problems.
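Cases I to III can be written as one short function. The sketch below is illustrative: the function name `linear_rank_stat`, the simulated data, and the sample sizes are all assumptions made for the example. It evaluates $S = \sum c(i)\,a(R_i)$ with the two-sample regression constants in (7), first with Wilcoxon scores and then with the median scores in (8).

```r
set.seed(5)
n <- 4   # X sample size (arbitrary)
m <- 5   # Y sample size (arbitrary)
N <- m + n
x <- rnorm(n)   # hypothetical X data
y <- rnorm(m)   # hypothetical Y data

# Ranks in the combined sample: R[1:n] are X-ranks, R[(n+1):N] are Y-ranks.
R <- rank(c(x, y))

# Hypothetical helper: S = sum over i of c(i) * a(R_i).
linear_rank_stat <- function(reg_const, scores, ranks) {
  sum(reg_const * scores[ranks])
}

c_two_sample <- c(rep(1, n), rep(0, m))        # regression constants (7)
a_wilcoxon   <- 1:N                            # Wilcoxon scores a(i) = i
a_median     <- as.numeric(1:N > (N + 1) / 2)  # median scores (8)

S_wilcoxon <- linear_rank_stat(c_two_sample, a_wilcoxon, R)  # ranksum W
S_median   <- linear_rank_stat(c_two_sample, a_median, R)    # count above M

c(W = S_wilcoxon, median_stat = S_median)
```

Only the score vector changes between the two statistics, which is the point of the linear rank formulation: a new test is obtained by swapping in a different set of scores.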

Lemma 5.18: Let $a(1), a(2), \ldots, a(N)$ be a set of $N$ constants. Then, if $R$ is uniformly distributed over the permutation set $\mathcal{R}$,

$$E[a(R_i)] = \frac{1}{N} \sum_{k=1}^{N} a(k) = \bar{a} \quad \text{for } i = 1, 2, \ldots, N,$$

$$Var[a(R_i)] = \frac{1}{N} \sum_{k=1}^{N} (a(k) - \bar{a})^2 \quad \text{for } i = 1, 2, \ldots, N,$$

$$Cov[a(R_i), a(R_j)] = \frac{-1}{N(N-1)} \sum_{k=1}^{N} (a(k) - \bar{a})^2 = \frac{-1}{N-1}\, Var[a(R_i)] \quad \text{for } i \ne j.$$

The proof of Lemma 5.18 involves using Theorem 5.7 and the definitions of $E(\cdot)$, $Var(\cdot)$, and $Cov(\cdot, \cdot)$.

Lemma 5.18 is used to establish the mean and variance of a linear rank statistic under the null hypothesis.

Theorem 5.19: Let $S$ be a linear rank statistic with regression constants $c(1), c(2), \ldots, c(N)$ and scores $a(1), a(2), \ldots, a(N)$. If $R$ is uniformly distributed over $\mathcal{R}$, then

$$E[S] = N \bar{c}\, \bar{a} \quad \text{and} \quad Var[S] = \frac{1}{N-1} \left[ \sum_{i=1}^{N} (c(i) - \bar{c})^2 \right] \left[ \sum_{k=1}^{N} (a(k) - \bar{a})^2 \right]$$

where $\bar{a} = (1/N) \sum_{i=1}^{N} a(i)$ and $\bar{c} = (1/N) \sum_{i=1}^{N} c(i)$.

5.5 Asymptotic Normality of Rank Statistics (Supplemental)

The regression constants $c(1), c(2), \ldots, c(N)$ are determined by the problem of interest. Thus, we will only place a weak restriction on these constants. The restriction essentially requires that asymptotically no individual $c(i)$ value is much larger than the other constants. Specifically, the restriction is

$$\frac{\sum_{i=1}^{N} (c(i) - \bar{c})^2}{\max_{1 \le i \le N} (c(i) - \bar{c})^2} \to \infty \quad \text{as } N \to \infty \tag{9}$$

where $\bar{c} = (1/N) \sum_{i=1}^{N} c(i)$. This is known as Noether's condition.

Let $\varphi$ be a real-valued function defined on $(0, 1)$ that (i) does not depend on $N$, (ii) can be written as the difference $\varphi = \varphi_1 - \varphi_2$ of two non-decreasing functions, and (iii) satisfies

$$0 < \int_0^1 \left[ \varphi(u) - \bar{\varphi} \right]^2 du < \infty \quad \text{with} \quad \bar{\varphi} = \int_0^1 \varphi(u)\, du.$$

A function $\varphi(\cdot)$ with these properties is called a square integrable score function.
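Returning to Theorem 5.19, the mean and variance formulas can be confirmed by enumerating all $N!$ rank vectors for a small $N$. The sketch below assumes $N = 5$, two-sample regression constants with $n = 2$, and an arbitrary set of scores; the helper `all_perms` is a hypothetical utility written for this illustration.

```r
# Hypothetical helper: all permutations of the vector v, one per row.
all_perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  out <- NULL
  for (i in seq_along(v)) {
    out <- rbind(out, cbind(v[i], all_perms(v[-i])))
  }
  out
}

N <- 5
reg_const <- c(1, 1, 0, 0, 0)     # two-sample regression constants, n = 2
scores    <- c(0.5, 1, 4, 2, 7)   # arbitrary hypothetical scores a(1..N)

# Under H0 every row of perms is an equally likely rank vector R.
perms <- all_perms(1:N)
S <- apply(perms, 1, function(R) sum(reg_const * scores[R]))

# Exact null mean and variance of S from the full enumeration.
exact_mean <- mean(S)
exact_var  <- mean((S - exact_mean)^2)

# Theorem 5.19.
theory_mean <- N * mean(reg_const) * mean(scores)
theory_var  <- sum((reg_const - mean(reg_const))^2) *
               sum((scores - mean(scores))^2) / (N - 1)

rbind(exact = c(exact_mean, exact_var), theory = c(theory_mean, theory_var))
```

With Wilcoxon scores $a(i) = i$ in place of the arbitrary scores, the same formulas reduce to the mean and variance of $W$ in Theorem 5.11.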

For a square integrable score function,

$$\int_0^1 \left[ \varphi(u) - \bar{\varphi} \right]^2 du = \int_0^1 \varphi^2(u)\, du - (\bar{\varphi})^2.$$

Let $\varphi$ be a square integrable score function and $a(1), a(2), \ldots, a(N)$ be scores that satisfy any of the following three conditions:

(A1) $a(i) = \varphi\!\left( \dfrac{i}{N+1} \right)$ for $i = 1, 2, \ldots, N$.

(A2) $a(i) = N \displaystyle\int_{(i-1)/N}^{i/N} \varphi(u)\, du$ for $i = 1, 2, \ldots, N$.

(A3) $a(i) = E[\varphi(U_{(i)})]$ where $U_{(i)}$ is the $i$th order statistic from a random sample of size $N$ from a uniform $(0, 1)$ distribution.

Let $S = \sum_{i=1}^{N} c(i)\, a(R_i)$.

Let $S^+ = \sum_{i=1}^{N} c(i)\, \Psi_i\, a(R_i^+)$.

Theorem 5.20 (Asymptotic Normality of Linear Rank Statistics): Under $H_0$ for a linear rank statistic $S$, and assuming Noether's condition and condition A1, A2, or A3,

$$\frac{S - E(S)}{\sqrt{Var(S)}} \xrightarrow{d} N(0, 1) \quad \text{as } N \to \infty.$$

Theorem 5.21 (Asymptotic Normality of Signed Rank Statistics): Under $H_0$ for a linear signed rank statistic $S^+$, and assuming Noether's condition and condition A1, A2, or A3,

$$\frac{S^+ - E(S^+)}{\sqrt{Var(S^+)}} \xrightarrow{d} N(0, 1) \quad \text{as } N \to \infty.$$

The linear rank statistics and signed rank statistics discussed in this course all have asymptotic $N(0, 1)$ distributions after standardizing.
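To get a feel for how quickly the normal limit takes hold, the exact null distribution of the ranksum statistic $W$ can be compared with its normal approximation. The sketch below is an illustration under arbitrary assumptions ($m = n = 8$, and a continuity correction of 0.5, which is a common convention rather than part of Theorem 5.20). It relies on base R's `pwilcox()`, which gives the distribution of $W - n(n+1)/2$.

```r
m <- 8
n <- 8
N <- m + n

# Null mean and variance of W (Theorem 5.11).
mu    <- n * (N + 1) / 2
sigma <- sqrt(m * n * (N + 1) / 12)

# All attainable values of W (Theorem 5.10).
w <- (n * (n + 1) / 2):(n * (2 * m + n + 1) / 2)

# Exact cdf: pwilcox() works with the shifted statistic W - n(n+1)/2.
exact <- pwilcox(w - n * (n + 1) / 2, n, m)

# Normal approximation to P0[W <= w] with a continuity correction.
approx <- pnorm((w + 0.5 - mu) / sigma)

# Largest disagreement between the exact and approximate cdfs.
max_err <- max(abs(exact - approx))
max_err
```

Even at these modest sample sizes the two cdfs are close, which is why large-sample p-values for rank tests are routinely computed from the standardized statistic.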


Problem Selected Scores Statistics Ph.D. Qualifying Exam: Part II November 20, 2010 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. Problem 1 2 3 4 5 6 7 8 9 10 11 12 Selected

More information

Design of the Fuzzy Rank Tests Package

Design of the Fuzzy Rank Tests Package Design of the Fuzzy Rank Tests Package Charles J. Geyer July 15, 2013 1 Introduction We do fuzzy P -values and confidence intervals following Geyer and Meeden (2005) and Thompson and Geyer (2007) for three

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Comparison of Power between Adaptive Tests and Other Tests in the Field of Two Sample Scale Problem

Comparison of Power between Adaptive Tests and Other Tests in the Field of Two Sample Scale Problem Comparison of Power between Adaptive Tests and Other Tests in the Field of Two Sample Scale Problem Chikhla Jun Gogoi 1, Dr. Bipin Gogoi 2 1 Research Scholar, Department of Statistics, Dibrugarh University,

More information

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible

More information

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics Understand Difference between Parametric and Nonparametric Statistical Procedures Parametric statistical procedures inferential procedures that rely

More information

A comparison study of the nonparametric tests based on the empirical distributions

A comparison study of the nonparametric tests based on the empirical distributions 통계연구 (2015), 제 20 권제 3 호, 1-12 A comparison study of the nonparametric tests based on the empirical distributions Hyo-Il Park 1) Abstract In this study, we propose a nonparametric test based on the empirical

More information

Version 1: Equality of Distributions. 3. F (x) and G(x) represent the distribution functions corresponding to the Xs and Y s, respectively.

Version 1: Equality of Distributions. 3. F (x) and G(x) represent the distribution functions corresponding to the Xs and Y s, respectively. 4 Two-Sample Methods 4.1 The (Mann-Whitney) Wilcoxon Rank Sum Test Version 1: Equality of Distributions Assumptions: Given two independent random samples X 1, X 2,..., X n and Y 1, Y 2,..., Y m : 1. The

More information

STAT Section 5.8: Block Designs

STAT Section 5.8: Block Designs STAT 518 --- Section 5.8: Block Designs Recall that in paired-data studies, we match up pairs of subjects so that the two subjects in a pair are alike in some sense. Then we randomly assign, say, treatment

More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

ORDER STATISTICS, QUANTILES, AND SAMPLE QUANTILES

ORDER STATISTICS, QUANTILES, AND SAMPLE QUANTILES ORDER STATISTICS, QUANTILES, AND SAMPLE QUANTILES 1. Order statistics Let X 1,...,X n be n real-valued observations. One can always arrangetheminordertogettheorder statisticsx (1) X (2) X (n). SinceX (k)

More information

Chapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics

Chapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics Chapter 6 Order Statistics and Quantiles 61 Extreme Order Statistics Suppose we have a finite sample X 1,, X n Conditional on this sample, we define the values X 1),, X n) to be a permutation of X 1,,

More information

Statistics 135 Fall 2008 Final Exam

Statistics 135 Fall 2008 Final Exam Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations

More information

A3. Statistical Inference Hypothesis Testing for General Population Parameters

A3. Statistical Inference Hypothesis Testing for General Population Parameters Appendix / A3. Statistical Inference / General Parameters- A3. Statistical Inference Hypothesis Testing for General Population Parameters POPULATION H 0 : θ = θ 0 θ is a generic parameter of interest (e.g.,

More information

Y i = η + ɛ i, i = 1,...,n.

Y i = η + ɛ i, i = 1,...,n. Nonparametric tests If data do not come from a normal population (and if the sample is not large), we cannot use a t-test. One useful approach to creating test statistics is through the use of rank statistics.

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

SOLUTIONS TO MATH68181 EXTREME VALUES AND FINANCIAL RISK EXAM

SOLUTIONS TO MATH68181 EXTREME VALUES AND FINANCIAL RISK EXAM SOLUTIONS TO MATH68181 EXTREME VALUES AND FINANCIAL RISK EXAM Solutions to Question A1 a) The marginal cdfs of F X,Y (x, y) = [1 + exp( x) + exp( y) + (1 α) exp( x y)] 1 are F X (x) = F X,Y (x, ) = [1

More information

Distribution-Free Procedures (Devore Chapter Fifteen)

Distribution-Free Procedures (Devore Chapter Fifteen) Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal

More information

One-Sample Numerical Data

One-Sample Numerical Data One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

Non-parametric tests, part A:

Non-parametric tests, part A: Two types of statistical test: Non-parametric tests, part A: Parametric tests: Based on assumption that the data have certain characteristics or "parameters": Results are only valid if (a) the data are

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G Distribution free hypothesis tests 1. Classical and distribution-free

More information

Non-specific filtering and control of false positives

Non-specific filtering and control of false positives Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview

More information

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER. 21 June :45 11:45 Two hours MATH20802 To be supplied by the Examinations Office: Mathematical Formula Tables THE UNIVERSITY OF MANCHESTER STATISTICAL METHODS 21 June 2010 9:45 11:45 Answer any FOUR of the questions. University-approved

More information

Session 3 The proportional odds model and the Mann-Whitney test

Session 3 The proportional odds model and the Mann-Whitney test Session 3 The proportional odds model and the Mann-Whitney test 3.1 A unified approach to inference 3.2 Analysis via dichotomisation 3.3 Proportional odds 3.4 Relationship with the Mann-Whitney test Session

More information

Wilcoxon Test and Calculating Sample Sizes

Wilcoxon Test and Calculating Sample Sizes Wilcoxon Test and Calculating Sample Sizes Dan Spencer UC Santa Cruz Dan Spencer (UC Santa Cruz) Wilcoxon Test and Calculating Sample Sizes 1 / 33 Differences in the Means of Two Independent Groups When

More information

Hypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes

Hypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes Neyman-Pearson paradigm. Suppose that a researcher is interested in whether the new drug works. The process of determining whether the outcome of the experiment points to yes or no is called hypothesis

More information

LIST OF FORMULAS FOR STK1100 AND STK1110

LIST OF FORMULAS FOR STK1100 AND STK1110 LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

Bayesian estimation of the discrepancy with misspecified parametric models

Bayesian estimation of the discrepancy with misspecified parametric models Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012

More information

Dr. Maddah ENMG 617 EM Statistics 10/12/12. Nonparametric Statistics (Chapter 16, Hines)

Dr. Maddah ENMG 617 EM Statistics 10/12/12. Nonparametric Statistics (Chapter 16, Hines) Dr. Maddah ENMG 617 EM Statistics 10/12/12 Nonparametric Statistics (Chapter 16, Hines) Introduction Most of the hypothesis testing presented so far assumes normally distributed data. These approaches

More information

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding

More information

Non-parametric (Distribution-free) approaches p188 CN

Non-parametric (Distribution-free) approaches p188 CN Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman s rank correlation; the Bootstrap. (ch14

More information

STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples.

STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. STAT 135 Lab 7 Distributions derived from the normal distribution, and comparing independent samples. Rebecca Barter March 16, 2015 The χ 2 distribution The χ 2 distribution We have seen several instances

More information

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2 Order statistics Ex. 4. (*. Let independent variables X,..., X n have U(0, distribution. Show that for every x (0,, we have P ( X ( < x and P ( X (n > x as n. Ex. 4.2 (**. By using induction or otherwise,

More information

TESTS BASED ON EMPIRICAL DISTRIBUTION FUNCTION. Submitted in partial fulfillment of the requirements for the award of the degree of

TESTS BASED ON EMPIRICAL DISTRIBUTION FUNCTION. Submitted in partial fulfillment of the requirements for the award of the degree of TESTS BASED ON EMPIRICAL DISTRIBUTION FUNCTION Submitted in partial fulfillment of the requirements for the award of the degree of MASTER OF SCIENCE IN MATHEMATICS AND COMPUTING Submitted by Gurpreet Kaur

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

Learning Objectives for Stat 225

Learning Objectives for Stat 225 Learning Objectives for Stat 225 08/20/12 Introduction to Probability: Get some general ideas about probability, and learn how to use sample space to compute the probability of a specific event. Set Theory:

More information

E X A M. Probability Theory and Stochastic Processes Date: December 13, 2016 Duration: 4 hours. Number of pages incl.

E X A M. Probability Theory and Stochastic Processes Date: December 13, 2016 Duration: 4 hours. Number of pages incl. E X A M Course code: Course name: Number of pages incl. front page: 6 MA430-G Probability Theory and Stochastic Processes Date: December 13, 2016 Duration: 4 hours Resources allowed: Notes: Pocket calculator,

More information

STATISTICS SYLLABUS UNIT I

STATISTICS SYLLABUS UNIT I STATISTICS SYLLABUS UNIT I (Probability Theory) Definition Classical and axiomatic approaches.laws of total and compound probability, conditional probability, Bayes Theorem. Random variable and its distribution

More information

Minimum distance tests and estimates based on ranks

Minimum distance tests and estimates based on ranks Minimum distance tests and estimates based on ranks Authors: Radim Navrátil Department of Mathematics and Statistics, Masaryk University Brno, Czech Republic (navratil@math.muni.cz) Abstract: It is well

More information

MATH Notebook 2 Spring 2018

MATH Notebook 2 Spring 2018 MATH448001 Notebook 2 Spring 2018 prepared by Professor Jenny Baglivo c Copyright 2010 2018 by Jenny A. Baglivo. All Rights Reserved. 2 MATH448001 Notebook 2 3 2.1 Order Statistics and Quantiles...........................

More information

This exam contains 13 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 13 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability and Statistics FS 2017 Session Exam 22.08.2017 Time Limit: 180 Minutes Name: Student ID: This exam contains 13 pages (including this cover page) and 10 questions. A Formulae sheet is provided

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

NONPARAMETRICS. Statistical Methods Based on Ranks E. L. LEHMANN HOLDEN-DAY, INC. McGRAW-HILL INTERNATIONAL BOOK COMPANY

NONPARAMETRICS. Statistical Methods Based on Ranks E. L. LEHMANN HOLDEN-DAY, INC. McGRAW-HILL INTERNATIONAL BOOK COMPANY NONPARAMETRICS Statistical Methods Based on Ranks E. L. LEHMANN University of California, Berkeley With the special assistance of H. J. M. D'ABRERA University of California, Berkeley HOLDEN-DAY, INC. San

More information

Problem 1 (20) Log-normal. f(x) Cauchy

Problem 1 (20) Log-normal. f(x) Cauchy ORF 245. Rigollet Date: 11/21/2008 Problem 1 (20) f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.0 0.2 0.4 0.6 0.8 4 2 0 2 4 Normal (with mean -1) 4 2 0 2 4 Negative-exponential x x f(x) f(x) 0.0 0.1 0.2 0.3 0.4 0.5

More information

The Convergence Rate for the Normal Approximation of Extreme Sums

The Convergence Rate for the Normal Approximation of Extreme Sums The Convergence Rate for the Normal Approximation of Extreme Sums Yongcheng Qi University of Minnesota Duluth WCNA 2008, Orlando, July 2-9, 2008 This talk is based on a joint work with Professor Shihong

More information

Bivariate Paired Numerical Data

Bivariate Paired Numerical Data Bivariate Paired Numerical Data Pearson s correlation, Spearman s ρ and Kendall s τ, tests of independence University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

University of California San Diego and Stanford University and

University of California San Diego and Stanford University and First International Workshop on Functional and Operatorial Statistics. Toulouse, June 19-21, 2008 K-sample Subsampling Dimitris N. olitis andjoseph.romano University of California San Diego and Stanford

More information

Nonparametric Statistics Notes

Nonparametric Statistics Notes Nonparametric Statistics Notes Chapter 5: Some Methods Based on Ranks Jesse Crawford Department of Mathematics Tarleton State University (Tarleton State University) Ch 5: Some Methods Based on Ranks 1

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test.

STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test. STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test. Rebecca Barter March 30, 2015 Mann-Whitney Test Mann-Whitney Test Recall that the Mann-Whitney

More information

NAG Library Chapter Introduction. G08 Nonparametric Statistics

NAG Library Chapter Introduction. G08 Nonparametric Statistics NAG Library Chapter Introduction G08 Nonparametric Statistics Contents 1 Scope of the Chapter.... 2 2 Background to the Problems... 2 2.1 Parametric and Nonparametric Hypothesis Testing... 2 2.2 Types

More information

1. Point Estimators, Review

1. Point Estimators, Review AMS571 Prof. Wei Zhu 1. Point Estimators, Review Example 1. Let be a random sample from. Please find a good point estimator for Solutions. There are the typical estimators for and. Both are unbiased estimators.

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

Asymptotic Statistics-III. Changliang Zou

Asymptotic Statistics-III. Changliang Zou Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (

More information

Chapter 7 Comparison of two independent samples

Chapter 7 Comparison of two independent samples Chapter 7 Comparison of two independent samples 7.1 Introduction Population 1 µ σ 1 1 N 1 Sample 1 y s 1 1 n 1 Population µ σ N Sample y s n 1, : population means 1, : population standard deviations N

More information

1 Exercises for lecture 1

1 Exercises for lecture 1 1 Exercises for lecture 1 Exercise 1 a) Show that if F is symmetric with respect to µ, and E( X )

More information

MAT 271E Probability and Statistics

MAT 271E Probability and Statistics MAT 71E Probability and Statistics Spring 013 Instructor : Class Meets : Office Hours : Textbook : Supp. Text : İlker Bayram EEB 1103 ibayram@itu.edu.tr 13.30 1.30, Wednesday EEB 5303 10.00 1.00, Wednesday

More information

Section 3: Permutation Inference

Section 3: Permutation Inference Section 3: Permutation Inference Yotam Shem-Tov Fall 2015 Yotam Shem-Tov STAT 239/ PS 236A September 26, 2015 1 / 47 Introduction Throughout this slides we will focus only on randomized experiments, i.e

More information

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable

Distributions of Functions of Random Variables. 5.1 Functions of One Random Variable Distributions of Functions of Random Variables 5.1 Functions of One Random Variable 5.2 Transformations of Two Random Variables 5.3 Several Random Variables 5.4 The Moment-Generating Function Technique

More information

Contents 1. Contents

Contents 1. Contents Contents 1 Contents 4 Paired Comparisons & Block Designs 3 4.1 Paired Comparisons.................... 3 4.1.1 Paired Data.................... 3 4.1.2 Existing Approaches................ 6 4.1.3 Paired-comparison

More information