6 Single Sample Methods for a Location Parameter


 Amelia McKenzie
If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually the median) are used.

Recall: $M$ is a median of a random variable $X$ if $P(X \le M) = P(X \ge M) = .5$. The distribution of $X$ is symmetric about $c$ if $P(X \le c - x) = P(X \ge c + x)$ for all $x$.

For symmetric continuous distributions, the median $M$ = the mean $\mu$. Thus, all conclusions about the median can also be applied to the mean.

If $X$ is a binomial random variable with parameters $n$ and $p$ (denoted $X \sim B(n, p)$), then
\[ P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} \quad \text{for } x = 0, 1, \ldots, n, \]
where $\binom{n}{x} = \dfrac{n!}{x!\,(n-x)!}$ and $k! = k(k-1)(k-2)\cdots 2 \cdot 1$.

Tables exist for the cdf $P(X \le x)$ for various choices of $n$ and $p$. The probabilities and cdf values are also easy to produce using SAS or R.

Thus, if $X \sim B(n, .5)$, we have
\[ P(X = x) = \binom{n}{x} (.5)^n, \qquad P(X \le x) = \sum_{k=0}^{x} \binom{n}{k} (.5)^n, \]
and $P(X \le x) = P(X \ge n - x)$ because the $B(n, .5)$ distribution is symmetric.

For sample sizes $n > 20$ and $p = .5$, a normal approximation (with continuity correction) to the binomial probabilities is often used instead of binomial tables. Calculate
\[ z = \frac{(x \pm .5) - .5n}{.5\sqrt{n}}. \]
Use $x + .5$ when $x < .5n$ and use $x - .5$ when $x > .5n$.

The value of $z$ is compared to $N(0, 1)$, the standard normal distribution. For example: $P(X \le x) \approx P(Z \le z)$ and $P(X \ge x) \approx P(Z \ge z) = 1 - P(Z \le z)$.

6.1 Ordinary Sign Test

Assumptions: Given a random sample of n independent observations
The measurement scale is at least nominal.
Observations can be classified into 2 nonoverlapping categories whose union exhausts all possibilities. The categories will be labeled + and −.
Hypotheses: The inference involves comparing probabilities $P(+)$ and $P(-)$ for outcomes + and −.
(A) Two-sided: $H_0: P(+) = P(-)$ vs $H_1: P(+) \ne P(-)$
(B) Upper one-sided: $H_0: P(+) \ge P(-)$ vs $H_1: P(+) < P(-)$
(C) Lower one-sided: $H_0: P(+) \le P(-)$ vs $H_1: P(+) > P(-)$
Note: $H_0$ is true only if $P(+) = P(-) = .5$.

Method: For a given α
Let $T_+$ = the number of + observations. Let $T_-$ = the number of − observations.
If $H_0$ is true, then we would expect $T_+$ and $T_-$ to be nearly equal ($\approx n/2$). In other words, if $H_0$ is true, $T_+$ and $T_-$ are binomial $B(n, .5)$ random variables.
For alternative hypothesis
(A) $H_1: P(+) \ne P(-)$. Let $T = \min(T_+, T_-)$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha/2$.
(B) $H_1: P(+) < P(-)$. Let $T = T_+$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.
(C) $H_1: P(+) > P(-)$. Let $T = T_-$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.

Decision Rule
For (A), (B), or (C), if $T$ is too small, then we will reject $H_0$. That is,
If $T \le t$, Reject $H_0$.
If $T > t$, Fail to Reject $H_0$.

Large Sample Approximation
1. For the one-sided $H_1$, calculate
\[ z = \frac{T + .5 - .5n}{.5\sqrt{n}} \quad \text{for (B) if } T_+ < .5n, \text{ for (C) if } T_- < .5n, \]
\[ z = \frac{T - .5 - .5n}{.5\sqrt{n}} \quad \text{for (B) if } T_+ > .5n, \text{ for (C) if } T_- > .5n. \]
2. For the two-sided $H_1$, take the smaller of the two z-values in (1.).
3. Find $\Phi(z) = P(Z \le z)$ from the standard normal distribution.
4. Reject $H_0$ if (i) $P(Z \le z) \le \alpha$ for either one-sided test or (ii) $P(Z \le z) \le \alpha/2$ for the two-sided test.
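The critical value and the large sample approximation described above can be sketched in R as follows. This is a minimal illustration, not part of the original notes: the helper names `sign_test_critical` and `sign_test_z` are hypothetical, n = 13 and a count of 3 are chosen only as an example, and the handling of a count exactly equal to .5n (not covered by the rule above) is an arbitrary choice.

```r
# Largest t with P(X <= t) <= alpha for X ~ B(n, .5); NA if no such t exists.
# (Hypothetical helper name, not from the notes.)
sign_test_critical <- function(n, alpha) {
  cdf <- pbinom(0:n, size = n, prob = 0.5)
  ok <- which(cdf <= alpha)
  if (length(ok) == 0) return(NA_integer_)
  max(ok) - 1L                      # which() is 1-based, x starts at 0
}

# Continuity-corrected z for a count out of n under p = .5.
# Assumption: a count equal to .5n is treated like the "> .5n" case.
sign_test_z <- function(count, n) {
  cc <- if (count < 0.5 * n) 0.5 else -0.5
  (count + cc - 0.5 * n) / (0.5 * sqrt(n))
}

n <- 13
t_one_sided <- sign_test_critical(n, alpha = 0.05)       # compare T to this for (B) or (C)
t_two_sided <- sign_test_critical(n, alpha = 0.05 / 2)   # compare T to this for (A)

# Exact and approximate lower-tail probabilities for an observed count of 3
z        <- sign_test_z(3, n)
p_exact  <- pbinom(3, n, 0.5)
p_approx <- pnorm(z)
print(c(t_one_sided = t_one_sided, t_two_sided = t_two_sided,
        z = z, p_exact = p_exact, p_approx = p_approx))
```

With n = 13 the normal approximation is used outside its recommended range (n > 20), so the comparison of `p_exact` and `p_approx` only shows how close the two are for a small sample.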
Example: (From Gibbons, Nonparametric Methods for Quantitative Analysis.) An oil company is considering the following procedures for training prospective service station managers:
1. On-the-job training under actual working conditions for three months.
2. A company-run school training program concentrated over one month.

They plan to compare the two procedures in an experiment. No training program can be the only determining factor for the success of a manager. Success is also affected by other factors such as age, intelligence, and previous experience. In order to eliminate the effects of these factors as much as possible, each trainee is matched with another trainee that has similar attributes (such as similar age and previous experience). If a good match does not exist for a trainee, then the trainee is not included in the experiment. Once pairs are determined, one member of each pair is randomly selected to receive the on-the-job training, while the other is assigned to the company school.

After completing the assigned training program, the personnel manager assesses each trainee and judges which member of each pair has done a better job of managing the service station. In total, 13 pairs had completed the training programs. The personnel manager stated that for 10 of the 13 pairs, the better manager received the company school training. Is there sufficient evidence to claim that the company-run school training program is more effective?

Table of Binomial Probabilities and Binomial CDF for n=13, p=.5 (columns: n, p, x, f(x) = Pr(X=x), F(x) = Pr(X<=x); table entries not preserved)

6.2 Sign (Binomial) Test for Location

Assumptions: Given a random sample of n independent observations $x_1, x_2, \ldots, x_n$:
The variable of interest is continuous, and the measurement scale is at least ordinal.

Hypotheses: The inference concerns a hypothesis about the median $M$ of a single population.
(A) Two-sided: $H_0: M = M_o$ vs $H_1: M \ne M_o$
(B) Upper one-sided: $H_0: M = M_o$ vs $H_1: M > M_o$
(C) Lower one-sided: $H_0: M = M_o$ vs $H_1: M < M_o$
Method: For a given α
Let $T_+$ = the number of observations $> M_o$. Let $T_-$ = the number of observations $< M_o$.
Delete any $x_i = M_o$ and adjust the sample size $n$ accordingly.
If $H_0$ is true, then $T_+$ and $T_-$ are binomial $B(n, .5)$ random variables. Thus, we would expect $T_+$ and $T_-$ to be approximately equal ($\approx n/2$).
For alternative hypothesis
(A) $H_1: M \ne M_o$. Let $T = \min(T_+, T_-)$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha/2$.
(B) $H_1: M > M_o$. Let $T = T_-$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.
(C) $H_1: M < M_o$. Let $T = T_+$. Then find the largest $t$ such that the $B(n, .5)$ probability $P(X \le t) \le \alpha$.
Perform the Ordinary Sign Test based on $T$ and $t$.

Decision Rule
For (A), (B), or (C), if $T$ is too small, then we will reject $H_0$. That is,
If $T \le t$, Reject $H_0$.
If $T > t$, Fail to Reject $H_0$.

Large Sample Approximation
Same as for the Ordinary Sign Test.

Example 2.1 from Applied Nonparametric Statistics by W. Daniel. In a study of heart disease, a researcher measured the blood's transit time in subjects with healthy right coronary arteries. The median transit time was 3.50 seconds. In another study, the researchers repeated the transit time study but on a sample of 11 patients with significantly blocked right coronary arteries. The results (in seconds) were

1.80  3.30  5.65  2.25  2.50  3.50  2.75  3.25  3.10  2.70  3.00

1. Can these researchers conclude (using α = .05) that the median transit time in the population of patients with significantly blocked right coronary arteries is different than 3.50 seconds?
2. Can these researchers conclude (α = .05) that the median transit time in the population of patients with significantly blocked right coronary arteries is less than 3.50 seconds?
Table of Binomial Probabilities and Binomial CDF for n=10, p=.5 (columns: n, p, x, f(x) = Pr(X=x), F(x) = Pr(X<=x); table entries not preserved)

6.2.1 Special Case: Paired Data

Assumptions: Given a random sample of n independent pairs of observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$:
Both variables X and Y are continuous, and the measurement scales are at least ordinal.

Testing Procedure:
Calculate all differences $D_i = y_i - x_i$ for $i = 1, \ldots, n$.
Use the median difference $M_D$ in the hypotheses. Typically, $M_D = 0$.
Run the Sign Test based on the differences (the $D_i$ values).

Example 4.1 from Applied Nonparametric Statistics by W. Daniel. Researchers studied the effects of togetherness on the heart rate in rats. They recorded the heart rates of 10 rats while they were alone and while in the presence of another rat. The results are shown below. Using an α = .05 significance level for the Sign Test, can we conclude that togetherness increases the heart rate in rats?

Rat        1    2    3    4    5    6    7    8    9   10
Alone    463  462  462  456  450  426  418  415  409  402
Together 523  494  461  535  476  454  448  408  470  437

For this data, the ten $D_i$ (together − alone) values are 60, 32, −1, 79, 26, 28, 30, −7, 61, 35.
6.2.2 Sign Test Examples using R and SAS

R Output for Sign (Binomial) Test Examples

    > # Sign Test Example from Gibbons
    > binom.test(10,13)
            Exact binomial test
    number of successes = 10, number of trials = 13, p-value = ...    <-- Fail to reject Ho
    alternative hypothesis: true probability of success is not equal to 0.5
    95 percent confidence interval:
     ... ...    <-- The CI contains .5 so we fail to reject Ho
    sample estimates:
    probability of success
                       ...

    > # Sign (Binomial) Test for Location - Daniel Ex. 2.1
    > time <- c(1.80,3.30,5.65,2.25,2.50,3.50,2.75,3.25,3.10,2.70,3.00)
    > time = time - 3.50
    > time
     [1] ...
    > ties = sum(time==0)
    > ties
    [1] 1
    > binom.test(sum(time>0),length(time)-ties)
            Exact binomial test
    number of successes = 1, number of trials = 10, p-value = ...    <-- Reject Ho
    alternative hypothesis: true probability of success is not equal to 0.5
    95 percent confidence interval:
     ... ...    <-- .5 is not in the CI, so reject Ho
    sample estimates:
    probability of success
                       0.1

    > # Sign (Binomial) Test for Location - Paired Data, Daniel Ex. 4.1
    > alone <- c(463,462,462,456,450,426,418,415,409,402)
    > together <- c(523,494,461,535,476,454,448,408,470,437)
    > diff <- together - alone
    > ties = sum(diff==0)
    > ties
    [1] 0
    > binom.test(sum(diff>0),length(diff)-ties)
            Exact binomial test
    number of successes = 8, number of trials = 10, p-value = ...    <-- Fail to reject Ho
    alternative hypothesis: true probability of success is not equal to 0.5
    95 percent confidence interval:
     ... ...
    sample estimates:
    probability of success
                       0.8

R Code for Sign (Binomial) Test Examples

    # Sign Test Example from Gibbons
    binom.test(10,13)

    # Sign (Binomial) Test for Location - Daniel Ex. 2.1
    time <- c(1.80,3.30,5.65,2.25,2.50,3.50,2.75,3.25,3.10,2.70,3.00)
    time = time - 3.50      # subtract the hypothesized median
    time
    ties = sum(time==0)     # observations equal to the hypothesized median are dropped
    ties
    binom.test(sum(time>0),length(time)-ties)

    # Sign (Binomial) Test for Location - Paired Data, Daniel Ex. 4.1
    alone <- c(463,462,462,456,450,426,418,415,409,402)
    together <- c(523,494,461,535,476,454,448,408,470,437)
    diff <- together - alone
    ties = sum(diff==0)
    ties
    binom.test(sum(diff>0),length(diff)-ties)

SAS Output for Sign (Binomial) Test Examples

In SAS, the Sign (Binomial) Test statistic is denoted M, and it represents the deviation in the observed count $T_+$ from the expected count $.5n$ when the null hypothesis is true.

    Ordinary Sign Test for Training Program Example
    The UNIVARIATE Procedure
    Variable: level

                 Tests for Location: Mu0=0
    Test           -Statistic-    -----p Value------
    Student's t    t     ...      Pr > |t|     ...
    Sign           M     3.5      Pr >= |M|    ...    <-- Fail to reject Ho
    Signed Rank    S     ...      Pr >= |S|    ...
    Sign (Binomial) Test for Example 2.1
    Variable: diff

              Basic Statistical Measures
        Location                Variability
    Mean      ...       Std Deviation    ...
    Median    ...       Variance         ...

                 Tests for Location: Mu0=0
    Test           -Statistic-    -----p Value------
    Student's t    t     ...      Pr > |t|     ...
    Sign           M     -4       Pr >= |M|    ...    <-- Reject Ho
    Signed Rank    S     ...      Pr >= |S|    ...

    Sign (Binomial) Test for Paired Differences - Example 4.1
    (PROC PRINT listing with columns Obs, alone, together, diff; entries not preserved)
    Variable: diff

              Basic Statistical Measures
        Location                Variability
    Mean      ...       Std Deviation    ...
    Median    ...       Variance         ...

                 Tests for Location: Mu0=0
    Test           -Statistic-    -----p Value------
    Student's t    t     ...      Pr > |t|     ...
    Sign           M     3        Pr >= |M|    ...    <-- Fail to reject Ho
    Signed Rank    S     24.5     Pr >= |S|    ...
SAS Code for Sign (Binomial) Test Examples

    DM 'LOG; CLEAR; OUT; CLEAR;';
    OPTIONS NODATE NONUMBER LS=76 PS=54;

    ****************************************************************;
    *** Ordinary Sign Test: Let 1,-1 represent the 2 categories. ***;
    *** The frq values are the category frequencies              ***;
    ****************************************************************;
    DATA in;
      INPUT level frq @@;
      LINES;
    1 10  -1 3
    ;
    DATA signtest (DROP=i);
      SET in;
      IF level =  1 THEN DO i = 1 TO frq; OUTPUT; END;
      IF level = -1 THEN DO i = 1 TO frq; OUTPUT; END;
    PROC UNIVARIATE DATA=signtest;
      VAR level;
      TITLE 'Ordinary Sign Test for Training Program Example';

    ******************************************;
    *** Sign (Binomial) Test for Location: ***;
    *** Example 2.1 in course notes        ***;
    ******************************************;
    DATA in2;
      med_time = 3.50;
      INPUT time @@;
      diff = time - med_time;
      OUTPUT;
      LINES;
    1.80 3.30 5.65 2.25 2.50 3.50 2.75 3.25 3.10 2.70 3.00
    ;
    PROC UNIVARIATE DATA=in2;
      VAR diff;
      TITLE 'Sign (Binomial) Test for Example 2.1';
    RUN;

    ****************************************************;
    *** Sign (Binomial) Test for Paired Differences: ***;
    *** Example 4.1 in course notes                  ***;
    ****************************************************;
    DATA in3;
      INPUT alone together @@;
      diff = together - alone;
      OUTPUT;
      LINES;
    463 523  462 494  462 461  456 535  450 476
    426 454  418 448  415 408  409 470  402 437
    ;
    PROC PRINT DATA=in3;
      TITLE 'Sign (Binomial) Test for Paired Differences - Example 4.1';
    PROC UNIVARIATE DATA=in3;
      VAR diff;
    RUN;
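As a cross-check between the two packages, the SAS statistic M (the deviation of $T_+$ from $.5n$) and the exact two-sided binomial p-value can be computed directly from the + and − counts of the three examples. This is a sketch only: the helper name `sign_summary` is hypothetical, and the two-sided p-value is taken as twice the smaller tail, which coincides with the exact test when p = .5.

```r
# M = T+ - .5n and the exact two-sided sign test p-value from the counts.
# (Hypothetical helper name, not from the notes.)
sign_summary <- function(n_plus, n_minus) {
  n <- n_plus + n_minus                      # ties are assumed already removed
  M <- n_plus - 0.5 * n
  p_two <- min(1, 2 * pbinom(min(n_plus, n_minus), n, 0.5))
  c(M = M, p_two_sided = p_two)
}

res <- rbind(
  training = sign_summary(10, 3),   # Gibbons: 10 of 13 pairs favored the school
  transit  = sign_summary(1, 9),    # Daniel Ex. 2.1: 1 above, 9 below 3.50, 1 tie dropped
  rats     = sign_summary(8, 2)     # Daniel Ex. 4.1: 8 positive, 2 negative differences
)
print(res)
```

The M column can be compared with the Sign rows of the SAS listings, and the p-values with the corresponding `binom.test` calls.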
6.3 Wilcoxon Signed Rank Test

Assumptions: Given a random sample of n independent observations $X_1, \ldots, X_n$:
Each $X_i$ was drawn from a symmetric and continuous population.
Each $X_i$ has the same median $M$ (for $i = 1, \ldots, n$).
The measurement scale is at least on the interval scale.

Hypotheses: The inference concerns a hypothesis about the median $M$ of a single population. Given $M_o$, a hypothesized value of the median, we have:
(A) Two-sided: $H_0: M = M_o$ vs $H_1: M \ne M_o$
(B) Lower one-sided: $H_0: M = M_o$ vs $H_1: M < M_o$
(C) Upper one-sided: $H_0: M = M_o$ vs $H_1: M > M_o$
Because of the symmetry assumption, we can replace the median $M$ with the mean $\mu$ in the hypotheses.

Method: For a given α
Calculate all differences $D_i = X_i - M_o$.
Remove all cases having $D_i = 0$ and adjust the sample size $n$ accordingly.
Assign ranks $1, 2, \ldots, n$ to the $|D_i|$. For tied $|D_i|$ values, assign average ranks.
If $H_0$ is true, then the $D_i$ are symmetrically distributed about 0. That is, we expect $\sum(\text{Ranks where } D_i > 0) \approx \sum(\text{Ranks where } D_i < 0)$.
Let $T_+ = \sum (R_i \text{ when } D_i > 0)$ and $T_- = \sum (R_i \text{ when } D_i < 0)$.
Under $H_0$, the sampling distributions of $T_+$ and $T_-$ are symmetric about $n(n+1)/4$ and can assume integer values from 0 to $n(n+1)/2$. Note that $T_- = n(n+1)/2 - T_+$.
For alternative hypothesis:
(A) $H_1: M \ne M_o$, let $T = \min(T_+, T_-)$. Let $w$ be the largest value from the Wilcoxon Signed Rank Test Table such that $P(T \le w) \le \alpha/2$.
(B) $H_1: M < M_o$, let $T = T_+$. Let $w$ be the largest value from the Wilcoxon Signed Rank Test Table such that $P(T \le w) \le \alpha$.
(C) $H_1: M > M_o$, let $T = T_-$. Let $w$ be the largest value from the Wilcoxon Signed Rank Test Table such that $P(T \le w) \le \alpha$.

Decision Rule
If $T \le w$, Reject $H_0$.
If $T > w$, Fail to Reject $H_0$.
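The method above can be sketched in R without a printed table, using the built-in signed rank cdf `psignrank` in place of the Wilcoxon Signed Rank Test Table (valid when there are no ties). The helper names `signed_rank_stats` and `signed_rank_critical` and the six data values are hypothetical, chosen only for illustration.

```r
# T+ and T- from the ranks of |D_i|, after dropping D_i = 0.
# (Hypothetical helper names, not from the notes.)
signed_rank_stats <- function(x, m0) {
  d <- x - m0
  d <- d[d != 0]                 # remove zero differences, n shrinks accordingly
  r <- rank(abs(d))              # average ranks for tied |D_i|
  c(T_plus = sum(r[d > 0]), T_minus = sum(r[d < 0]), n = length(d))
}

# Largest w with P(T <= w) <= alpha under H0; NA if none exists.
signed_rank_critical <- function(n, alpha) {
  support <- 0:(n * (n + 1) / 2)
  ok <- which(psignrank(support, n) <= alpha)
  if (length(ok) == 0) return(NA_integer_)
  max(ok) - 1L
}

# Hypothetical data with hypothesized median 10 (one observation equals 10)
stats <- signed_rank_stats(c(7, 12, 9, 15, 4, 10), m0 = 10)
print(stats)

# Critical values for n = 12 at alpha = .05
w_two <- signed_rank_critical(12, 0.05 / 2)   # for the two-sided alternative (A)
w_one <- signed_rank_critical(12, 0.05)       # for a one-sided alternative (B) or (C)
print(c(w_two = w_two, w_one = w_one))
```

The decision rule is then a comparison of the chosen T (the smaller of `T_plus` and `T_minus` for (A), `T_plus` for (B), `T_minus` for (C)) with the corresponding critical value.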
Large Sample Approximation (with continuity correction) (n > 30)

Computer packages like R and SAS calculate approximate p-values for the two-sided alternative $H_1$ based on large sample normal distribution approximations. The normalizing formula is:
\[ T^* = \frac{T - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}} = \frac{T - E(T)}{\sqrt{Var(T)}}. \]
Daniel (Applied Nonparametric Statistics, page 42) describes an adjustment to this formula in the event of ties.

Example of Wilcoxon Signed Rank Test

A random sample of 12 fish was taken and the bodyweights recorded:

2.11  2.22  2.23  2.41  2.54  2.73  2.80  2.80  2.92  3.06  3.12  3.12

Test the null hypothesis $H_0: \mu = 3.0$ against the alternative $H_1: \mu < 3.0$ pounds. (Assume they are sampled from the same symmetric distribution.)

R code for Wilcoxon Signed Rank Test with Confidence Interval

    # Wilcoxon Signed Rank Test with Confidence Interval for Fish Data
    fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)
    wilcox.test(fish,mu=3,conf.int=TRUE)

R output for Wilcoxon Signed Rank Test with Confidence Interval

    > # Wilcoxon Signed Rank Test with Confidence Interval for Fish Data
    > fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)
    > wilcox.test(fish,mu=3,conf.int=TRUE)
            Wilcoxon signed rank test with continuity correction
    data:  fish
    V = 8, p-value = ...
    alternative hypothesis: true location is not equal to 3
    95 percent confidence interval:
     ... ...
    sample estimates:
    (pseudo)median
               ...

SAS code and selected output:

    SIGN TEST AND WILCOXON SIGNED RANK TEST
    The UNIVARIATE Procedure

              Basic Statistical Measures
        Location                Variability
    Mean      ...       Std Deviation          ...
    Median    ...       Variance               ...
    Mode      ...       Range                  ...
                        Interquartile Range    ...
                 Tests for Location: Mu0=0
    Test           -Statistic-    -----p Value------
    Student's t    t     ...      Pr > |t|     ...
    Sign           M     -3       Pr >= |M|    ...
    Signed Rank    S     -31      Pr >= |S|    0.0112    <-- p-value

The Signed Rank statistic S in SAS $= T_+ - n(n+1)/4 = 8 - (12)(13)/4 = 8 - 39 = -31$.
The p-value is based on the normal approximation and is for a two-sided alternative. Thus, for the one-sided $H_1: M < 3.0$, the approximate p-value = .0112/2 = .0056.

    OPTIONS LS=72 PS=60 NONUMBER NODATE;
    DATA IN;
      INPUT X @@;
      X = X - 3;
      CARDS;
    2.11 2.22 2.23 2.41 2.54 2.73 2.80 2.80 2.92 3.06 3.12 3.12
    ;
    PROC UNIVARIATE DATA=IN;
      VAR X;
      TITLE 'ONE SAMPLE TESTS FOR LOCATION: ';
      TITLE2 'SIGN TEST AND WILCOXON SIGNED RANK TEST';
    RUN;

Reference Distribution for the Signed Rank Test (n = 5)

If $H_0$ is true, then any $D_i$ has probability 1/2 of being > 0 or < 0 (equivalently, $X_i > M_o$ or $X_i < M_o$).
Without loss of generality, let $D_1, D_2, D_3, D_4, D_5$ be ordered from smallest to largest in absolute value, so that $D_i$ has rank $i$.
Then when n = 5, every possible assignment of signs to $D_1, D_2, D_3, D_4, D_5$ has a $(1/2)^5 = 1/32$ chance of occurring.

    T+    Possible Ranks With D_i > 0          Probability    Cumulative Probability
     0    None                                 1/32            1/32 = .03125
     1    1                                    1/32            2/32 = .0625
     2    2                                    1/32            3/32 = .09375
     3    3 or 1,2                             2/32            5/32 = .15625
     4    4 or 1,3                             2/32            7/32 = .21875
     5    5 or 1,4 or 2,3                      3/32           10/32 = .3125
     6    1,5 or 2,4 or 1,2,3                  3/32           13/32 = .40625
     7    2,5 or 3,4 or 1,2,4                  3/32           16/32 = .5
     8    3,5 or 1,2,5 or 1,3,4                3/32           19/32 = .59375
     9    4,5 or 1,3,5 or 2,3,4                3/32           22/32 = .6875
    10    1,4,5 or 2,3,5 or 1,2,3,4            3/32           25/32 = .78125
    11    2,4,5 or 1,2,3,5                     2/32           27/32 = .84375
    12    3,4,5 or 1,2,4,5                     2/32           29/32 = .90625
    13    1,3,4,5                              1/32           30/32 = .9375
    14    2,3,4,5                              1/32           31/32 = .96875
    15    1,2,3,4,5                            1/32           32/32 = 1
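The n = 5 reference distribution can be reproduced by brute force: list all $2^5 = 32$ sign assignments, add up the ranks that carry a positive sign, and tabulate. The following R sketch does this and compares the cumulative probabilities with the built-in `psignrank`; the variable names are illustrative only.

```r
n <- 5
# Each row is one sign assignment; 1 means the difference with that rank is positive
signs  <- as.matrix(expand.grid(rep(list(0:1), n)))   # 32 x 5 matrix
t_plus <- as.vector(signs %*% seq_len(n))             # T+ = sum of ranks with D_i > 0

t_max  <- n * (n + 1) / 2
counts <- table(factor(t_plus, levels = 0:t_max))     # number of assignments per T+ value

dist <- data.frame(
  T_plus   = 0:t_max,
  prob     = as.vector(counts) / 2^n,
  cum_prob = cumsum(as.vector(counts)) / 2^n
)
print(dist)

# Agreement with R's exact signed rank cdf
print(all.equal(dist$cum_prob, psignrank(0:t_max, n)))
```

The same enumeration works for any small n, although the number of rows doubles with each additional observation.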
6.3.2 Special Case: Paired Data

Assumptions: Given a random sample of n pairs of observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Let $D_i = y_i - x_i$ for $i = 1, \ldots, n$.
The $D_i$'s are independent.
The measurement scale is at least on the interval scale.
The distribution of the differences $D_i = y_i - x_i$ for $i = 1, \ldots, n$ is symmetric.

Testing Procedure:
Calculate all differences $D_i = y_i - x_i$ for $i = 1, \ldots, n$.
Use the median difference $M_D$ in the hypotheses. Typically, $M_D = 0$.
Because of the symmetry assumption, we can replace the median $M_D$ with the mean $\mu_D$ in the hypotheses.
Run the Wilcoxon Signed Rank Test based on the $D_i$.

Example of Wilcoxon Signed Rank Test for Paired Data

Two judges were asked to independently rate the rehabilitative potential for each of 22 male prison inmates. The following table contains the ratings:

(Table with columns Inmate (i), Judge 1, Judge 2, $D_i$, $|D_i|$, Sign, $R_i$; entries not preserved. Two inmates with $D_i = 0$ are marked "remove tie".)
SAS code and output:

    DATA IN;
      DO INMATE=1 TO 22;
        INPUT JUDGE1 JUDGE2 @@;
        DIFF = JUDGE1 - JUDGE2;
        OUTPUT;
      END;
      CARDS;
    ;
    PROC UNIVARIATE DATA=IN;
      VAR DIFF;
      TITLE 'WILCOXON SIGNED RANK TEST FOR PAIRED DATA';
    RUN;

    ====================================================================
    WILCOXON SIGNED RANK TEST FOR PAIRED DATA
    The UNIVARIATE Procedure
    Variable: DIFF

                 Tests for Location: Mu0=0
    Test           -Statistic-    -----p Value------
    Student's t    t     ...      Pr > |t|     ...
    Sign           M     5        Pr >= |M|    ...
    Signed Rank    S     75       Pr >= |S|    ...    <-- Reject Ho

The approximate p-value is ... Thus, we would reject the null hypothesis $H_o: M_D = 0$.

Confidence Interval for the Median Based on the Wilcoxon Signed Rank Test

To find the point estimate for the median $M$:
Calculate all paired averages $u_{ij}$ allowing replication:
\[ u_{ij} = \frac{x_i + x_j}{2}. \]
There are $\binom{n}{2} + n = \binom{n+1}{2}$ such averages.
Arrange the $u_{ij}$ in increasing order.
The point estimate for $M$ is $\hat{M}$ = the median of the $\{u_{ij}\}$.

Method: For an approximate confidence level $100(1-\alpha)\%$:
Use the Wilcoxon Signed Rank Test Table to find the largest $t$ such that $P(T \le t) \le \alpha/2$.
Let $M_L$ = the $(t+1)$st $u_{ij}$ observation from the beginning and $M_U$ = the $(t+1)$st $u_{ij}$ observation from the end of the set of ordered $u_{ij}$ values.
Statistically, $P(M_L \le M \le M_U) = P(t + 1 \le T \le n(n+1)/2 - t)$, where $T$ is the Wilcoxon signed rank statistic.
The approximate $100(1-\alpha)\%$ confidence interval is $(M_L, M_U)$.
The exact confidence level for $(M_L, M_U)$ is determined by the distribution given in the Wilcoxon Signed Rank Test Table. That is, if $p = P(T \le t)$, then $(M_L, M_U)$ is an exact $100(1-2p)\%$ confidence interval for $M$.
Note: You do not need to calculate all of the $u_{ij}$ values, but only the $(t+1)$st largest and smallest.
This procedure is also known as the Hodges-Lehmann estimates of shift.

Example of Hodges-Lehmann confidence interval for M

A random sample of 12 fish was taken and the body weights were recorded:

2.11  2.22  2.23  2.41  2.54  2.73  2.80  2.80  2.92  3.06  3.12  3.12

Calculate an approximate 95% confidence interval for the median bodyweight $M$.

For n = 12, the largest value in the Wilcoxon Signed Rank Test Table with $P(T \le t) \le .025$ is $t = 13$. Note: $P(T \le 13) = .0212$.
Then $t + 1 = 14$ and $n(n+1)/2 - t = 78 - 13 = 65$.
Now find the 14th and 65th values in the list of the 78 $u_{ij}$ values.

HODGES-LEHMANN CONFIDENCE INTERVAL (listing of the 78 ordered $u_{ij}$ values with columns Obs, x1, x2, u; entries not preserved. The 14th value is marked as the lower endpoint and the 65th value as the upper endpoint.)

Thus, the confidence interval is (2.42, 2.93).
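The interval for the fish data can be reproduced with a few lines of R. This is a sketch under the assumption that `psignrank` stands in for the Wilcoxon Signed Rank Test Table (it ignores the ties in the data, as the table does); the variable names are illustrative.

```r
fish <- c(2.11, 2.22, 2.23, 2.41, 2.54, 2.73, 2.80, 2.80, 2.92, 3.06, 3.12, 3.12)
n <- length(fish)

# All paired averages u_ij = (x_i + x_j)/2 with i <= j, sorted increasingly
walsh <- outer(fish, fish, "+") / 2
u <- sort(walsh[upper.tri(walsh, diag = TRUE)])
N <- length(u)                                   # n(n+1)/2 averages

# Largest t with P(T <= t) <= alpha/2 for alpha = .05
t_crit <- max(which(psignrank(0:N, n) <= 0.025)) - 1

# (t+1)st value from the beginning and (t+1)st value from the end
ci <- c(lower = u[t_crit + 1], upper = u[N - t_crit])
point_estimate <- median(u)                      # Hodges-Lehmann estimate of M
exact_level <- 1 - 2 * psignrank(t_crit, n)      # exact confidence level 1 - 2p

print(ci)
print(c(estimate = point_estimate, level = exact_level))
```

Sorting all 78 averages is not strictly necessary, as noted above, but for small n it is the simplest way to locate the two order statistics.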
R code for Wilcoxon Signed Rank Test with Confidence Interval

    # Wilcoxon Signed Rank Test with Confidence Interval for Fish Data
    fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)
    wilcox.test(fish,mu=3,conf.int=TRUE)

R output for Wilcoxon Signed Rank Test with Confidence Interval

            Wilcoxon signed rank test with continuity correction
    data:  fish
    V = 8, p-value = ...
    alternative hypothesis: true location is not equal to 3
    95 percent confidence interval:
     ... ...    <-- Approximate 95% confidence interval for mu is (2.42, 2.93)
    sample estimates:
    (pseudo)median
               ...

Asymptotic Relative Efficiency (A.R.E.)

One way to compare properties of statistical tests is to compare the efficiency properties. The definition of efficiency can vary but, generally speaking, it is used to compare the sample size required of one test with that of another test under similar conditions.

Suppose that two tests may be used to test a particular $H_0$ against a particular $H_1$, and both tests have the same specified α and β errors. These tests are therefore comparable under conditions related to the level of significance α and power $(1 - \beta)$.

Thus, the test requiring the smaller sample size to satisfy these conditions will have the smaller sampling cost and effort. That is, the test with the smaller required sample size is more efficient than the other test, and its relative efficiency is greater than one.

Let $T_1$ and $T_2$ represent two tests that test the same $H_0$ against the same $H_1$ with the same specified α and β values. For example, $T_1$ is the Sign Test and $T_2$ is the Wilcoxon Signed Rank Test, which are used to test $H_0: \mu = \mu_0$ with α = .05 and power $1 - \beta = .90$.

The relative efficiency of test $T_1$ with respect to test $T_2$ is the ratio $n_2/n_1$, where $n_1$ is the required sample size of $T_1$ to equal the power of test $T_2$ which has sample size $n_2$ (assuming the same $H_0$ and significance level α). Thus, there is a relative efficiency of $T_1$ with respect to $T_2$ for each choice of α and $n_2$.
A more general measure of efficiency (asymptotic relative efficiency) was developed. Consider the situation of letting sample size $n_1$ increase for $T_1$ with specified α and β. Then there exists a sequence of $n_2$ values, such that for each value of $n_1$ ($n_1 \to \infty$), $T_2$ has the same α and β values. In other words, there is a sequence of relative efficiency values $n_2/n_1$. If $n_2/n_1$ approaches a constant value as $n_1 \to \infty$, and if that constant is the same for all choices of α and β, then the constant is called the asymptotic relative efficiency of $T_1$ with respect to $T_2$.
Note that if the A.R.E. exists for $T_1$ and $T_2$, then the limiting A.R.E. value is independent of the choice of α and β.

To select a test with superior power, we generally select the test with the greatest A.R.E., because the power depends on many factors such as the maximum number of observations that can be collected given experimental or sampling resources and the type of distribution that generates the data (normal?, Weibull?, gamma?, ...), which is usually unknown.

The A.R.E. is, in general, difficult to calculate. In this course, we will only consider A.R.E. results for various pairs of tests and for several choices of distributions.

Note that A.R.E. assumes that an infinite sample size can be taken. Thus, a natural question arises: How good is a measure assuming an infinitely large sample when most practical situations involve relatively small sample sizes?

In an attempt to answer this question, studies of exact relative efficiency values for very small samples have shown that A.R.E. provides a good approximation to the relative efficiency in many situations of practical interest.

A.R.E. Comparison for Three Single-Sample Tests of Location

We will compare the t-test, Sign test, and the Wilcoxon Signed Rank test using the A.R.E. values. To do this we will consider three situations involving symmetric distributions. Under symmetry assumptions, $H_0$ and $H_1$ are identical for all three tests.
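For the three tests compared here, A.R.E. values for a given symmetric density can be checked numerically. The sketch below relies on two standard textbook (Pitman efficiency) formulas that are not stated in these notes and are therefore an assumption of this example: A.R.E.(Sign, t) $= 4\sigma^2 f(M)^2$ and A.R.E.(Wilcoxon, t) $= 12\sigma^2 \left(\int f(x)^2\,dx\right)^2$, where $f$ is the density, $M$ its median, and $\sigma^2$ its variance. The helper names and the choice of N(0, 1), U(0, 1), and a standard double exponential density are illustrative.

```r
# Assumed standard formulas (not from the notes):
are_sign_vs_t     <- function(var, f_med)  4  * var * f_med^2
are_wilcoxon_vs_t <- function(var, int_f2) 12 * var * int_f2^2

# Numerical integral of f(x)^2 over (lower, upper)
int_sq <- function(f, lower, upper) {
  integrate(function(x) f(x)^2, lower, upper)$value
}

dde <- function(x) 0.5 * exp(-abs(x))   # double exponential DE(0, 1) density

cases <- list(
  normal  = list(var = 1,    f_med = dnorm(0), int_f2 = int_sq(dnorm, -Inf, Inf)),
  uniform = list(var = 1/12, f_med = 1,        int_f2 = int_sq(dunif, 0, 1)),
  dexp    = list(var = 2,    f_med = dde(0),   int_f2 = 2 * int_sq(dde, 0, Inf))
)

are <- t(sapply(cases, function(k) {
  s_t <- are_sign_vs_t(k$var, k$f_med)
  w_t <- are_wilcoxon_vs_t(k$var, k$int_f2)
  c(sign_vs_t = s_t, wilcoxon_vs_t = w_t, wilcoxon_vs_sign = w_t / s_t)
}))
print(round(are, 3))
```

The last column uses the fact that the efficiency of the Wilcoxon test relative to the Sign test is the ratio of their efficiencies relative to the t-test. A value above one favors the first-named test.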
(I) The sample was randomly sampled from a normal distribution having density function
\[ f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{(x-\mu)^2}{2\sigma^2} \right] \quad \text{for } -\infty < x < \infty. \]
Without loss of generality, we can assume it is a standard normal N(0, 1) having density function
\[ \phi(x) = \frac{1}{\sqrt{2\pi}} \exp\left( -x^2/2 \right) \quad \text{for } -\infty < x < \infty. \]

(II) The sample was randomly sampled from a uniform distribution having density function
\[ f(x; a, b) = \frac{1}{b - a} \ \text{ for } a < x < b, \qquad = 0 \ \text{ otherwise.} \]
Without loss of generality, we can assume a uniform U(0, 1) having density function
\[ f(x) = 1 \ \text{ for } 0 < x < 1, \qquad = 0 \ \text{ otherwise.} \]
The uniform distribution is considered a light-tailed symmetric distribution.
(III) The sample was randomly sampled from a double exponential distribution (DE) having density function
\[ f(x; a, b) = \frac{1}{2b} \exp\left( -\frac{|x - a|}{b} \right) \quad \text{for } -\infty < x < \infty. \]
Without loss of generality, we can assume DE(0, 1) having density function
\[ f(x) = \frac{1}{2} \exp\left( -|x| \right) \quad \text{for } -\infty < x < \infty. \]
The DE distribution is a heavy-tailed symmetric distribution.

Table of A.R.E. Values

    Test                           (I) Normal            (II) Uniform          (III) Double Exponential
    Comparison                     Distribution          Distribution          Distribution
    Sign test                      2/π = .637            1/3 = .333            2/1 = 2.0
      vs t-test                    (t)                   (t)                   (Sign)
    Wilcoxon Signed Rank test      3/2 = 1.5             3/1 = 3.0             3/4 = .75
      vs Sign test                 (Wlcxn)               (Wlcxn)               (Sign)
    Wilcoxon Signed Rank test      3/π = .955            1/1 = 1.0             3/2 = 1.5
      vs t-test                    (t)                   (=)                   (Wlcxn)

The label in parentheses indicates which test is more efficient.

6.5 Introduction to the One-Sample Randomization Test

Paired Data Example: An experimental drug was tested on 7 subjects. Blood level measurements were taken before (X) and after (Y) administering the drug. In this situation, we have paired data. The differences (D = Y − X) in blood level measurements for each subject were:

    Patient i          1       2       3       4       5       6       7
    Difference D_i   -.187    .011   -.250    .034   -.137   -.112   -.023

The goal is to test the hypothesis that there is no change in blood level measurement after taking the drug.
Statistically, we will assume that the distribution of the difference in blood level measurements under the null hypothesis $H_0$ is symmetric about 0.
Thus, $H_0: \mu_D = 0$. We will consider two possible alternatives: (1) $H_1: \mu_D \ne 0$ and (2) $H_1: \mu_D < 0$.

If $H_0$ is true (and assuming symmetry about 0), the signs (+ or −) of the 7 measurements can be considered random. For example, we could have just as likely observed the same seven magnitudes with a different assignment of signs (the two alternative tables of Patient i and Difference $D_i$ shown here were not preserved), or any other randomization of the signs.

The Randomization Reference Distribution
1. Consider all possible sign assignments or randomizations of signs for the seven differences.
2. Calculate $\sum D_i$ for each randomization. In this example, there are $2^7 = 128$ different randomizations of the seven signs. This yields the randomization distribution of $\sum D_i$. In terms of testing, it is statistically equivalent to use the randomization distribution of the mean $\bar{D}$.
3. Now compare the OBSERVED $\sum D_i = -.664$ to the randomization distribution to find the probability (p-value) associated with the test $H_0: \mu_D = 0$ against the alternative $H_1$ hypothesis.

Case 1: For alternative $H_1: \mu_D \ne 0$, from the randomization reference distribution we see the
p-value $= P(|\sum D_i| \ge .664) = P(\sum D_i \le -.664) + P(\sum D_i \ge .664) = (6 + 6)/128 = .09375$.

Case 2: For alternative $H_1: \mu_D < 0$, from the randomization reference distribution we see the
p-value $= P(\sum D_i \le -.664) = 6/128 = .046875$.
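The randomization reference distribution for this example is small enough to enumerate completely. The R sketch below does so; it is illustrative only. The signs attached to the seven differences are an assumption reconstructed from the magnitudes used in the R code of these notes and an observed sum of −.664 (two positive and five negative differences), and the tolerance `tol` is added to guard the comparisons against floating point rounding.

```r
# Differences; signs are an assumption consistent with an observed sum of -.664
D <- c(-.187, .011, -.250, .034, -.137, -.112, -.023)
n <- length(D)

# All 2^n assignments of signs to the magnitudes |D_i|
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))   # 128 x 7 matrix
sums  <- as.vector(signs %*% abs(D))                      # randomization distribution of sum(D_i)

obs <- sum(D)
tol <- 1e-9   # the observed assignment itself must count as "equal to" obs

p_lower <- mean(sums <= obs + tol)              # H1: mu_D < 0
p_upper <- mean(sums >= obs - tol)              # H1: mu_D > 0
p_two   <- mean(abs(sums) >= abs(obs) - tol)    # H1: mu_D != 0

print(c(observed_sum = obs, p_lower = p_lower, p_upper = p_upper, p_two = p_two))
```

Dividing `sums` and `obs` by n gives the equivalent distribution of the mean difference and leaves the p-values unchanged.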
RANDOMIZATION TEST #1 (two-page listing of the 128 sign randomizations sorted by D_SUM, with columns Obs, ID1, ID2, ID3, ID4, ID5, ID6, ID7, D_SUM, CDF; entries not preserved. The rows at the observed sum are marked in each tail.)
SAS Code to Generate the Randomization Distribution for the Change in Blood Level Measurements

    DM 'LOG;CLEAR;OUT;CLEAR;';
    OPTIONS LS=72 PS=68 NONUMBER NODATE;
    DATA IN;
      INPUT D1-D7;
      CARDS;
    -.187 .011 -.250 .034 -.137 -.112 -.023
    ;
    DATA IN; SET IN;
      DO I1=-1 TO 1 BY 2; ID1=I1*D1;
      DO I2=-1 TO 1 BY 2; ID2=I2*D2;
      DO I3=-1 TO 1 BY 2; ID3=I3*D3;
      DO I4=-1 TO 1 BY 2; ID4=I4*D4;
      DO I5=-1 TO 1 BY 2; ID5=I5*D5;
      DO I6=-1 TO 1 BY 2; ID6=I6*D6;
      DO I7=-1 TO 1 BY 2; ID7=I7*D7;
        D_SUM = SUM(OF ID1-ID7);
        OUTPUT;
      END; END; END; END; END; END; END;
      KEEP ID1-ID7 D_SUM;
    PROC SORT DATA=IN; BY D_SUM;
    DATA IN; SET IN;
      CDF=_N_/128;
    PROC PRINT DATA=IN;
      TITLE 'RANDOMIZATION TEST #1';
    RUN;

Randomization Test for Paired Data

Assumptions: Given a random sample of n pairs of observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.
The differences $D_i = y_i - x_i$ are independent.
The distribution of each $D_i$ is symmetric and has the same mean.
The measurement scale for the $D_i$'s is at least interval.

Hypotheses: The inference concerns a hypothesis about whether or not the mean difference $\mu_D = 0$:
(A) Two-sided: $H_0: \mu_D = 0$ vs $H_1: \mu_D \ne 0$
(B) Lower one-sided: $H_0: \mu_D = 0$ vs $H_1: \mu_D < 0$
(C) Upper one-sided: $H_0: \mu_D = 0$ vs $H_1: \mu_D > 0$
Because of the symmetry assumption, we can replace the mean difference $\mu_D$ with the median difference $M_D$ in the hypotheses.

Method: For a given α
Calculate the sum $\sum D_i$ for each of the possible $2^n$ sign randomizations.
Order these values to form the randomization distribution for the $\sum D_i$.
Decision Rule
For (A) $H_1: \mu_D \ne 0$: If ACTUAL $\sum D_i < 0$, let $D^* = \sum D_i$. If ACTUAL $\sum D_i > 0$, let $D^* = -\sum D_i$. Then
\[ p\text{-value} = 2 \times \frac{\text{Number of } \sum D_i\text{'s} \le D^*}{2^n}. \]
For (B) $H_1: \mu_D < 0$, let $D^* = \sum D_i$. Then
\[ p\text{-value} = \frac{\text{Number of } \sum D_i\text{'s} \le D^*}{2^n}. \]
For (C) $H_1: \mu_D > 0$, let $D^* = -\sum D_i$. Then
\[ p\text{-value} = \frac{\text{Number of } \sum D_i\text{'s} \le D^*}{2^n} = \frac{\text{Number of } \sum D_i\text{'s} \ge -D^*}{2^n}. \]
Reject $H_0$ if p-value ≤ α. Otherwise, we fail to reject $H_0$.

Without loss of generality, you can replace $\sum D_i$ with $\bar{D}$ in the preceding arguments.

Note that the number of sign randomizations ($2^n$) forming the randomization distribution grows rapidly. For example, when n = 20, there are over 1 million randomizations. In such cases, it is generally not feasible to generate the entire randomization distribution.

To handle this problem, a large number of randomizations of the signs are randomly taken. Then, an approximate randomization distribution is generated from this large subset of possible randomizations. Approximate p-values can then be determined from this distribution. This is known as the Monte Carlo approach to generating approximate p-values.

The following R code will generate p-values using the Monte Carlo approach for the single-sample Randomization Test for Location.

R Code for Randomization Test on Paired Data (Differences)

    # Single Sample Randomization Test for Location

    # Enter the number of permutations to take
    Prep = 50000
    Prep

    # Enter vector of differences
    D <- c(-.187,.011,-.250,.034,-.137,-.112,-.023)
    D

    # Calculate the mean difference
    meanD <- mean(D)
    meanD
    sgnD <- sign(meanD)

    Fp = 0
    upper <- 0
    lower <- 0
    n = length(D)

    # Begin sign randomizations
    meanpermD <- 1:Prep
    for (i in 1:Prep){
      sgnvec <- sign(runif(n)-.5)    # random vector with 1 or -1 values
      permvec <- sgnvec*D

      # Calculate the mean difference for the i_th randomization vector
      meanpermD[i] <- mean(permvec)
      if(meanpermD[i]>=meanD) upper = upper+1
      if(meanpermD[i]<=meanD) lower = lower+1
    }

    # Calculate p-values:
    # for lower one-sided Ho
    pval_lower <- lower/Prep
    pval_lower
    # for upper one-sided Ho
    pval_upper <- upper/Prep
    pval_upper
    # for two-sided Ho
    if(sgnD < 0) pval_two_sided = 2*pval_lower
    if(sgnD > 0) pval_two_sided = 2*pval_upper
    pval_two_sided

    hist(meanpermD)

R Output for Randomization Test on Paired Data

    > meanD
    [1] ...
    > # Calculate p-values:
    > # for lower one-sided Ho
    [1] 0.04644
    > # for upper one-sided Ho
    [1] 0.96076
    > # for two-sided Ho
    [1] 0.09288

Note that the p-values from the Monte Carlo approach (.04644, .96076, .09288) approximate the exact p-values of (.046875, .9609375, .09375) from the true randomization distribution.

Without loss of generality, I used $\bar{D}$ instead of $\sum D_i$ in my R code.
Figure: Histogram of 50,000 values of meanpermD using the Monte Carlo approach (x-axis: meanpermD; y-axis: Frequency).

Single Sample Randomization Test for $H_o: \mu = \mu_0$

Suppose we want to perform a randomization test if our inference concerns a hypothesis about whether or not $\mu = \mu_0$ for some specified value $\mu_0$ against one of three alternatives:
(A) Two-sided: $H_0: \mu = \mu_0$ vs $H_1: \mu \ne \mu_0$
(B) Lower one-sided: $H_0: \mu = \mu_0$ vs $H_1: \mu < \mu_0$
(C) Upper one-sided: $H_0: \mu = \mu_0$ vs $H_1: \mu > \mu_0$

To perform a randomization test, simply subtract $\mu_0$ from each observation, and then run the randomization test as you would for paired data.

Example: A random sample of 12 fish was taken and the bodyweights recorded:

2.11  2.22  2.23  2.41  2.54  2.73  2.80  2.80  2.92  3.06  3.12  3.12

Test the null hypothesis $H_0: \mu = 3.0$ against the alternative $H_1: \mu < 3.0$ pounds.

Based on the randomization test, the p-value is approximately .005. Therefore, we would reject $H_0: \mu = 3.0$ and conclude that $\mu < 3.0$ pounds.
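The Monte Carlo approach for a hypothesized mean can also be written without an explicit loop. The following R sketch is one possible vectorized variant, not the code used in these notes: it draws all random signs at once as a matrix, and the seed value, the tolerance `tol`, and the variable names are arbitrary choices made for this illustration.

```r
set.seed(704)   # arbitrary seed, only so that the sketch is reproducible

fish <- c(2.11, 2.22, 2.23, 2.41, 2.54, 2.73, 2.80, 2.80, 2.92, 3.06, 3.12, 3.12)
mu0  <- 3
D    <- fish - mu0          # subtract mu_0, then proceed as for paired differences
n    <- length(D)
B    <- 50000               # number of random sign assignments

# One row per randomization: independent +1/-1 multipliers for the n differences
signs      <- matrix(sample(c(-1, 1), B * n, replace = TRUE), nrow = B, ncol = n)
perm_means <- as.vector(signs %*% D) / n

obs_mean <- mean(D)
tol <- 1e-9                 # guards the comparisons against floating point rounding

p_lower <- mean(perm_means <= obs_mean + tol)    # H1: mu < mu0
p_upper <- mean(perm_means >= obs_mean - tol)    # H1: mu > mu0
p_two   <- min(1, 2 * min(p_lower, p_upper))     # H1: mu != mu0

print(c(obs_mean = obs_mean, p_lower = p_lower, p_upper = p_upper, p_two = p_two))
```

Because the p-values are estimated from random draws, they change slightly with the seed and with B. With n = 12 there are only $2^{12} = 4096$ sign assignments, so a complete enumeration is also feasible for this data set.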
R Code for Randomization Test for Fish Weight Data

    # Single Sample Randomization Test for Location

    # Enter the number of randomizations to take
    Prep <- 50000

    fish <- c(2.11,2.22,2.23,2.41,2.54,2.73,2.80,2.80,2.92,3.06,3.12,3.12)

    # Enter the hypothesized mean
    mu0 <- 3                 # <-- enter mu_0

    # Enter vector of differences
    D <- fish - mu0          # <-- subtract mu_0 from the data
    D

    # Calculate the mean difference
    meand <- mean(D)
    meand
    sgnd <- sign(meand)

    upper <- 0
    lower <- 0
    n <- length(D)

    # Begin sign randomizations
    meanpermd <- numeric(Prep)
    for (i in 1:Prep){
      # random vector with 1 or -1 values
      sgnvec <- sign(runif(n) - .5)
      permvec <- sgnvec*D
      # Calculate the mean difference for the i_th randomization vector
      meanpermd[i] <- mean(permvec)
      if(meanpermd[i] >= meand) upper <- upper+1
      if(meanpermd[i] <= meand) lower <- lower+1
    }

    # Calculate p-values:
    # for lower one-sided test
    pval_lower <- lower/Prep
    pval_lower
    # for upper one-sided test
    pval_upper <- upper/Prep
    pval_upper
    # for two-sided test
    if(sgnd < 0) pval_two_sided <- 2*pval_lower
    if(sgnd > 0) pval_two_sided <- 2*pval_upper
    pval_two_sided

    hist(meanpermd)
R Output for Randomization Test on Fish Weight Data

    > # Enter vector of differences
    [1] -0.89 -0.78 -0.77 -0.59 -0.46 -0.27 -0.20 -0.20 -0.08  0.06  0.12  0.12
    > # Calculate the mean difference
    [1] -0.3283333
    > # Calculate p-values:
    > # for lower one-sided test
    [1]
    > # for upper one-sided test
    [1]
    > # for two-sided test
    [1]

[Figure: Histogram of 50,000 values of meanpermd using the Monte Carlo approach (x-axis: meanpermd; y-axis: Frequency)]
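The Monte Carlo loop for the fish data can also be written without an explicit loop by drawing all the random signs at once. The following is a minimal sketch of that alternative, not the code used to produce the output above. The seed value and the tolerance tol are assumptions introduced here. The tolerance guards against floating-point round-off when a randomized mean ties with the observed mean.

```r
# Vectorized Monte Carlo sign-randomization test for H0: mu = mu0.
set.seed(425)  # hypothetical seed, used only to make the run reproducible

fish <- c(2.11, 2.22, 2.23, 2.41, 2.54, 2.73, 2.80, 2.80, 2.92, 3.06, 3.12, 3.12)
mu0 <- 3
D <- fish - mu0          # shift the data so that H0 becomes mu_D = 0
n <- length(D)
Prep <- 50000            # number of sign randomizations

# Prep x n matrix of random signs; each row is one randomization.
signs <- matrix(sample(c(-1, 1), Prep * n, replace = TRUE), nrow = Prep)

# Mean of the sign-flipped differences for every randomization.
meanpermd <- as.vector(signs %*% D) / n
meand <- mean(D)

# ASSUMPTION: a small tolerance so that exact ties count in both tails
# despite floating-point round-off.
tol <- 1e-9
pval_lower <- mean(meanpermd <= meand + tol)
pval_upper <- mean(meanpermd >= meand - tol)
pval_two_sided <- if (meand < 0) 2 * pval_lower else 2 * pval_upper

c(lower = pval_lower, upper = pval_upper, two_sided = pval_two_sided)
```

Drawing the signs with sample() serves the same purpose as sign(runif(n) - .5) in the loop version: each sign is +1 or -1 with probability .5. The matrix form uses more memory (Prep × n values) in exchange for avoiding the loop.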