Data Analysis and Statistical Methods Statistics 651


Suhasini Subba Rao

Review

In the previous lecture we considered the following tests:

The independent sample t-test. This is used when we have two completely independent samples drawn from two populations, and we want to compare their population means. If the sample sizes are relatively large and there are few outliers, we can test equality of the means (or one-sided versions) using the independent sample t-test.

The Wilcoxon rank-sum test. This is used when we have two completely independent samples drawn from two populations, and we want to compare their population means (or distributions). If the sample sizes are small and there appear to be outliers, we can test equality of their distributions (means) using the Wilcoxon rank-sum test (we do not have standard errors in this test).

The paired t-test. This is used when we have paired observations, the two members of each pair coming from different populations, e.g. the running time of a runner at high and low altitudes. In this case the observations within a pair are dependent, so we cannot use either of the above tests. We can check for dependence by plotting the pairs against each other (e.g. high altitude against low altitude). If the sample size is relatively large and there are not many outliers, use a paired t-test.

The Wilcoxon signed-rank test. Today's class.

A quick review on the idea of testing

Suppose a bomb is exploded at the center 0. The bomb spreads debris everywhere. The closer you are to the bomb, the more likely the debris will hit you. The standard deviation of the distance the debris travels (remember, it is a measure of the amount of spread) is 3 miles (this means the variance is 9). Suppose the spread of the debris has a normal distribution. Basically this means that about 95% of the debris will be spread within a radius of $1.96 \times 3$ miles (on either side) of 0. The other 5% will land outside this radius. Suppose I am standing 4 miles from the center (0). Do you think the bomb will hit me?
Suppose I am standing 9 miles to the right of the center. Do you think I will get hit? What is the probability I will get hit? It is $P(Z > 9/3) = P(Z > 3) \approx 0.0013$. This is very small.
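The tail probability in this story can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `upper_tail_probability` is a hypothetical helper introduced here for illustration, and it assumes the debris distance is exactly normal with the stated center and standard deviation.

```python
from statistics import NormalDist


def upper_tail_probability(distance: float, sd: float, center: float = 0.0) -> float:
    """Return P(X > distance) for X ~ Normal(center, sd^2)."""
    # The ratio (distance - center) / sd is the z-value used in the lecture.
    z = (distance - center) / sd
    return 1.0 - NormalDist().cdf(z)


# Standing 9 miles from a bomb centered at 0 with a spread (sd) of 3 miles.
p_hit = upper_tail_probability(distance=9, sd=3)
print(f"z = {9 / 3:.1f}, P(Z > z) = {p_hit:.4f}")
```

Passing a different `sd` shows how the same distance becomes more or less extreme as the spread changes, which is the point the stories are making.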

Suppose I am standing 9 miles to the right of the center and I did get hit. Everyone is telling me the bomb was exploded at 0. But I know that the probability of me being hit when the bomb is located at center 0 is about 0.1%. I am suspicious; this number is really small, though I don't know the exact location of the bomb. But I am pretty sure that it is not at 0 or any location less than 0. This value is a p-value. It is the probability of being hit when the center is zero. If it is small, it is unlikely the center is zero. This means rejecting the null that the bomb is located at zero.

Another story

A more scattering bomb is exploded at center 0. The standard deviation (measure of spread) of the debris is 6 miles (this means the variance is 36). This means that 95% of the debris will be spread within $1.96 \times 6$ miles of the center 0. I am standing 9 miles away from the bomb. The chance of me being hit is $P(Z > 9/6) = P(Z > 1.5) \approx 0.067$. This means roughly a 7% chance of me being hit when the bomb is at zero. Suppose I get hit; then I cannot say that the bomb is at zero, but I cannot say that it is not at zero. 0.067 is the p-value for me being hit when the bomb was exploded at zero. This means we cannot reject the null that the bomb is located at zero.

What have you learnt from my stories

I am an idiot for standing so close to the bombs.
The distance that is safe to stay away from the bomb depends on the standard deviation (the variance).
The smaller the ratio distance/(standard deviation), the more likely I am to be hit when the center is zero.
The larger the ratio distance/(standard deviation), the less likely I am to be hit when the center is zero.

A quick review of JMP output

Typically the output in JMP looks like:

Mean (difference) | Std. Error | Upper 95% | Lower 95% | N | t ratio | DF | Prob > |t|

Mean is the sample mean, or the difference in the sample means.
Std. Error refers to the estimated standard deviation of the estimator (sample mean).
N refers to the sample size.

t ratio refers to the z-transform of the estimator. Usually t = Mean/Std. Error.
DF refers to the degrees of freedom of the t-distribution used.
Prob > |t| refers to 2 × (the one-sided p-value), that is, the p-value when testing the hypothesis $H_0$: true mean = 0 (or difference in means is equal to zero) against the alternative $H_A$: true mean not equal to zero (or difference in means is not equal to zero).

JMP output and what you should be able to do with it

Using this output you should be able to:

Construct 90% CIs, 99% CIs, etc.
Test the hypothesis $H_0$: true mean = 5, for example (or difference in means is equal to five), against the alternative $H_A$: true mean not equal to 5 (or difference in means is not equal to five).

Tests we have done so far

In the table, B and Std. Error are the unstandardised coefficients.

Example | B | Std. Error | t | Sig
one sample mean | $\bar{X}$ | $\sqrt{s^2/n}$ | $\frac{\bar{X}-\mu_0}{\sqrt{s^2/n}}$ | $P\left(|t_{n-1}| \ge \left|\frac{\bar{X}-\mu_0}{\sqrt{s^2/n}}\right|\right)$
independent samples t-test | $\bar{D}$ | $\sqrt{s_p^2\left(\frac{1}{n}+\frac{1}{m}\right)}$ | $\frac{\bar{D}}{\sqrt{s_p^2\left(\frac{1}{n}+\frac{1}{m}\right)}}$ | $P\left(|t_{n+m-2}| \ge \left|\frac{\bar{D}}{\sqrt{s_p^2\left(\frac{1}{n}+\frac{1}{m}\right)}}\right|\right)$
paired t-test | $\bar{D}$ | $\sqrt{s_d^2/n}$ | $\frac{\bar{D}}{\sqrt{s_d^2/n}}$ | $P\left(|t_{n-1}| \ge \left|\frac{\bar{D}}{\sqrt{s_d^2/n}}\right|\right)$

Remember $|\cdot|$ means taking the positive value of a number, e.g. $|-3| = 3$.

Remember standard deviation is a measure of spread. If we have normality then a 95% CI for the true parameter is [B − 1.96 × Std. Error, B + 1.96 × Std. Error]. Roughly, this means it is highly likely the true mean lies in this interval.

In the above table the p-value is evaluated under the null hypothesis. That is:

Example | B | hypothesis (p-value evaluated under the null)
one sample mean | $\bar{X}$ | $H_0: \mu = \mu_0$, $H_A: \mu \neq \mu_0$
independent samples t-test | $\bar{D}$ | $H_0: \mu_1 - \mu_2 = 0$, $H_A: \mu_1 - \mu_2 \neq 0$
paired t-test | $\bar{D}$ | $H_0: \mu_1 - \mu_2 = 0$, $H_A: \mu_1 - \mu_2 \neq 0$
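As an illustration of what can be done with a Mean and Std. Error pair, here is a minimal sketch using only the Python standard library. The numbers (`mean = 6.2`, `std_error = 0.5`) are hypothetical and do not come from the lecture, and both helper functions are names introduced here. The sketch uses normal quantiles, matching the 1.96 rule quoted above; with few degrees of freedom the corresponding t quantile should be used instead.

```python
from statistics import NormalDist


def normal_ci(estimate: float, std_error: float, level: float = 0.95):
    """Confidence interval estimate +/- z * std_error using a normal quantile."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for level = 0.95
    return estimate - z * std_error, estimate + z * std_error


def two_sided_p_value(estimate: float, std_error: float, null_value: float = 0.0) -> float:
    """Two-sided p-value for H0: true mean = null_value (normal approximation)."""
    z = (estimate - null_value) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical output values, for illustration only.
mean, std_error = 6.2, 0.5

for level in (0.90, 0.95, 0.99):
    lower, upper = normal_ci(mean, std_error, level)
    print(f"{level:.0%} CI: [{lower:.2f}, {upper:.2f}]")

# Test H0: true mean = 5 against HA: true mean not equal to 5.
p_value = two_sided_p_value(mean, std_error, null_value=5)
print(f"p-value for H0: mean = 5 is {p_value:.4f}")
```

Note that the null value is subtracted before dividing by the standard error, so the reported t ratio (which tests a mean of zero) cannot be reused directly for a null of 5.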

Example I: Runners at altitude

Runners were compared at a high and low altitude. For each runner, the running time was measured at a high altitude and then again at a low altitude. 12 runners were used.

[Table: Runner, High, Low]

Do you think altitude has an effect on running time? Let $\mu_Y$ denote the mean time at a low altitude and $\mu_X$ denote the mean time at a high altitude. Use $\alpha = 0.05$. We want to test $H_0: \mu_Y - \mu_X = \mu_d \ge 0$ against the alternative $H_A: \mu_Y - \mu_X = \mu_d < 0$.

Solution I: Using the paired t-test

[Table: Runner, High, Low, Low - High]

Using the sample differences, calculate the sample mean and sample variance: $\bar{D} = -1.2$ and $s_d^2 = \frac{1}{11}\sum_{i=1}^{12}(D_i - \bar{D})^2 = 1.16$. Since $\bar{D} = -1.2$ is on the side of the alternative (noting it is a one-sided test), we can do the test.

Under the null hypothesis

$\frac{\bar{D}}{\sqrt{s_d^2/12}} \sim t(12-1).$

Now we construct a rejection region. The rejection region is towards the left hand side (we see this from the alternative). Hence we reject the null if $\bar{D} = \bar{Y} - \bar{X} = -1.2$ is less than $0 - t_{0.05}(11)\sqrt{s_d^2/12} = -1.796 \times \sqrt{1.16/12} \approx -0.56$. Since $\bar{D} = -1.2$ is less than $-0.56$ we can reject the null.

Equivalently, using JMP we can get the p-value. We see that the p-value $P(t \le -3.89)$ is so small that there is enough evidence to reject the null.

Example II: The use of cell phones

There has been a lot of speculation that the use of cell phones while driving has increased the number of accidents. Scientists wanted to test whether talking on a cell phone increased a driver's reaction time. To test the hypothesis they randomly sampled 30 people and placed each of them in a car simulator. For each driver, the reaction time to the sudden appearance of a colour stimulus was recorded both when the driver was not on the phone and when the driver was on the phone. Based on this sample the average reaction time when not on the phone was $\bar{X} = 0.5$ seconds. The average reaction time when on the phone was $\bar{Y} = 0.7$ seconds. The pooled sample variance is $s_p^2 = 0.6$ and the sample variance of the differences is $s_d^2 = 0.01$.
Is there evidence to suggest that using a cell phone increases the reaction time while driving (state the test you would use, the hypothesis and do the test at the 5% level)?

Solutions II

Because the same person is being used in both experiments (and it is highly likely that reaction time depends on the individual), it would be wise to do a paired test (since there is likely to be dependence between the pairs).

Let $\mu_X$ be the mean reaction time when not on the phone and $\mu_Y$ be the mean reaction time when on the phone. Let $\mu_d = \mu_Y - \mu_X$; this is the mean reaction time on the phone minus the mean reaction time off the phone. We do not observe $\mu_X$, $\mu_Y$ and $\mu_d$. However, it is conjectured that reaction time increases with cell phone use. If this is true then $\mu_d = \mu_Y - \mu_X > 0$ (i.e. average reaction time on the phone minus average reaction time without the phone is greater than zero). Hence we want to test $H_0: \mu_d \le 0$ against $H_A: \mu_d > 0$.

Since the pairs in the data are dependent, we use the paired t-test. This means that to do the test we use the sample variance of the differences, which is $s_d^2 = 0.01$, and as an estimate of $\mu_d$ we use $\bar{D} = \bar{Y} - \bar{X} = 0.7 - 0.5 = 0.2$.

Using $\bar{D} = 0.2$, $s_d^2 = 0.01$ and $n = 30$ (there are $n$ pairs) we can do the test just as if it were a one-sample test, but using the differences instead (hence we use a t-test, since we estimate the variance). We note that the standard error is $\sqrt{s_d^2/n} = \sqrt{0.01/30} \approx 0.018$.

We construct a 5% rejection region (RR). It is on the right hand side (since $H_A$ is pointing right). Remember, to construct the RR we need to center it about the mean in the null, which is 0. Hence the RR is any value of $\bar{D}$ greater than $0 + t_{29}(0.05)\sqrt{s_d^2/n} = 1.699 \times 0.018 \approx 0.031$. Since $\bar{D} = 0.2 > 0.031$, there is evidence to reject the null.

Question

Suppose we want to test that the average reaction time increases by more than 0.1 second. What would the conclusion of the test be (using $\alpha = 5\%$)?
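The rejection-region calculation for the cell phone example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variance of the differences is taken to be 0.01 (the value consistent with the threshold of roughly 0.03 quoted in the solution), the critical value 1.699 is the tabulated upper 5% point of the t-distribution with 29 degrees of freedom, and `paired_t_threshold` is a hypothetical helper name.

```python
import math


def paired_t_threshold(null_mean: float, var_diff: float, n: int, t_crit: float) -> float:
    """Right-hand rejection threshold: null_mean + t_crit * sqrt(var_diff / n)."""
    std_error = math.sqrt(var_diff / n)  # standard error of the mean difference
    return null_mean + t_crit * std_error


d_bar = 0.7 - 0.5      # mean difference: on the phone minus off the phone
var_diff = 0.01        # assumed sample variance of the differences
n = 30                 # number of pairs
T_CRIT_29 = 1.699      # upper 5% point of t(29), taken from standard tables

threshold = paired_t_threshold(0.0, var_diff, n, T_CRIT_29)
t_ratio = d_bar / math.sqrt(var_diff / n)

print(f"standard error = {math.sqrt(var_diff / n):.4f}")
print(f"rejection threshold = {threshold:.3f}, t ratio = {t_ratio:.2f}")
print("reject H0" if d_bar > threshold else "do not reject H0")
```

To test a null other than zero, pass that value as `null_mean`; the rejection region simply shifts so that it is centered on the new null mean.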

What to do when the number of pairs is small?

The sample size n = 6 is small, and we have used a t-distribution with 5 degrees of freedom (which means the test is not very sensitive at detecting effects, since the critical value of t(5) is quite large). [But despite this we have still rejected the null hypothesis.]

Important

For obvious reasons a paired t-test is only possible when the two samples are of the same size. But to do the paired t-test we require the usual assumptions. If n is small, then the observations must come from a normal distribution (difficult to check in practice). If n is large it does not matter because $\bar{D}$ will be almost normal.

The small sample size raises the question: is there a nonparametric version of this test, which does not require normality of $D_i$?

A nonparametric alternative: The Wilcoxon signed-rank test (uses Table 6)

This is to test $H_0: \mu_d = 0$ against $H_A: \mu_d \neq 0$. We do not require normality of $D_i$, but the distribution of the differences $D_i$ must be symmetric about the median (one of the main assumptions in this test). Hence the test is equivalent to checking whether the median of the differences is zero, against the alternative that it is not zero.

Recipe:

Calculate the difference between the pairs of samples, $D_i = X_i - Y_i$.
Delete all zero values and let n be the number of non-zero values.
List the absolute values of the differences and rank them from smallest to largest (tied values share the average of their ranks).
To each rank give a sign (negative or positive), depending on whether the difference is negative or positive.
Add all the negative ranks together, call this $T_-$.
Add all the positive ranks together, call this $T_+$.
Find the smaller of $T_-$ and $T_+$, and label this T ($T = \min(T_-, T_+)$).

Wilcoxon signed-rank test for the Friday 13th data

[Table: Sample 1, Sample 2, Difference, Sign, Absolute value, Rank]

Totals: $T_- = 20.5$, $T_+ = 1.5$.

Look up Table 6. Using the number of pairs n, whether you are doing a two-sided or one-sided test, and $\alpha$, select the appropriate value. If T is less than this value, there is enough evidence to reject the null.
The test is $H_0$: the mean numbers of accidents which happen on Friday 13th and Friday 6th are the same ($\mu_d = 0$), against the alternative $H_A$: the mean numbers of accidents which happen on Friday 13th and Friday 6th are different ($\mu_d \neq 0$).
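The recipe for the signed-rank statistic can be sketched as follows, using only built-in Python. The differences in the example are made-up values chosen to show a deleted zero and a tie; they are not the Friday 13th data, and `signed_rank_sums` is a hypothetical helper name.

```python
def signed_rank_sums(differences):
    """Return (T_plus, T_minus, T, n) for the Wilcoxon signed-rank recipe."""
    # Step 1: delete zero differences.
    nonzero = [d for d in differences if d != 0]
    # Step 2: order by absolute value and assign ranks, averaging over ties.
    ordered = sorted(nonzero, key=abs)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Positions i..j-1 hold ranks i+1..j; their average is (i + 1 + j) / 2.
        rank_of[abs(ordered[i])] = (i + 1 + j) / 2
        i = j
    # Step 3: sum the ranks separately for positive and negative differences.
    t_plus = sum(rank_of[abs(d)] for d in nonzero if d > 0)
    t_minus = sum(rank_of[abs(d)] for d in nonzero if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus), len(nonzero)


# Made-up differences D_i = X_i - Y_i, for illustration only.
diffs = [2.5, -1.0, 3.1, 0.0, -0.4, 4.2, 1.0]
t_plus, t_minus, t_stat, n = signed_rank_sums(diffs)
print(f"n = {n}, T+ = {t_plus}, T- = {t_minus}, T = {t_stat}")
```

A useful sanity check is that $T_+ + T_-$ always equals $n(n+1)/2$, the sum of the ranks 1 to n. The resulting T is then compared with the tabulated critical value for the given n and $\alpha$.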

The Wilcoxon signed-rank test and Table 6

$T_+$ = sum of positive ranks ($T_+ = 1.5$).
$T_-$ = sum of negative ranks ($T_- = 20.5$).
T = smaller of $T_+$ and $T_-$ (in our example T = 1.5).
Look up Table 6 for a particular $p = \alpha$ and n (so in our case we use a two-sided test with p = 0.05 and n = 6).
Reject $H_0$ if T < value in the table.

Looking up the tables with p = 0.05 and n = 6 we see that Value = 0. Since 1.5 > 0, using the Wilcoxon signed-rank test we do not have enough evidence to reject the null hypothesis.

Though I think this is just because the sample size is too small and the power of the test is zero! If we increased the level to $\alpha = 0.2$, then Value = 2 and we would be able to reject the null. Based on this test, what are your conclusions about Friday 13th?

Reminder: What test to do when...

If we want to test whether two samples come from the same population (or whether one distribution is a shift of another) and

both samples are of the same size and the pairs of observations are independent, and
  the sample size is large, or small with normal data, then use the t-test.
  the sample size is small and the data are not normal, then use the Wilcoxon rank-sum (Mann-Whitney U) test.

the pairs of observations are dependent, and
  the sample size is large, or small with normal data, then use the paired t-test.
  the sample size is small and the data are not normal, then use the Wilcoxon signed-rank test.

We should use the test which suits the data, because loss of power can occur if we use the wrong test.

Aside

What happens when we use the wrong test:

For the t-test comparing two samples of the same size, the non-rejection region is $\left[-t_{\alpha/2}(2n-2)\, s_p\sqrt{2/n},\; t_{\alpha/2}(2n-2)\, s_p\sqrt{2/n}\right]$, where $s_p^2$ is the pooled variance.

For the paired t-test we use a t-distribution, and the non-rejection region is $\left[-t_{\alpha/2}(n-1)\, s_d/\sqrt{n},\; t_{\alpha/2}(n-1)\, s_d/\sqrt{n}\right]$, where $s_d^2 = \frac{1}{n-1}\sum_i (d_i - \bar{d})^2$.

Remember $s_d$ and $s_p$ are different. In the case of independent pairs $s_d \approx \sqrt{2}\, s_p$. By comparing the non-rejection regions we see:

If there is large dependence between pairs and we mistakenly use the t-test, we lose power because $s_p$ is large.
If there is only a small dependence between pairs and we mistakenly use the paired t-test, we lose power because we are using a $t(n-1)$ rather than a $t(2n-2)$.

Example: Runners at altitude

Runners were compared at a high and low altitude. For each runner, the running time was measured at a high altitude and then again at a low altitude. 12 runners were used.

[Table: Runner, High, Low]

Use the Wilcoxon signed-rank test to test the hypothesis that the median difference is zero against the alternative that it is different from zero.

Aside: The normal approximation of the signed-rank test

When n > 50 we use a normal approximation. Let

$\mu_T = \frac{n(n+1)}{4}, \qquad \sigma_T^2 = \frac{n(n+1)(2n+1)}{24}.$

Under the null, $Z = \frac{T - \mu_T}{\sqrt{\sigma_T^2}} \sim N(0,1)$. Calculate $P(Z < z)$, where z is the observed value of Z. If this is smaller than $\alpha/2$, reject $H_0$.
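The normal approximation can be sketched in Python with the standard library. The inputs below (n = 60 non-zero differences and T = 600) are hypothetical values picked only to exercise the formulas, and `signed_rank_normal_approx` is a name introduced here for illustration.

```python
import math
from statistics import NormalDist


def signed_rank_normal_approx(t_stat: float, n: int):
    """Return (mu_T, var_T, z, P(Z < z)) for the signed-rank statistic T."""
    mu_t = n * (n + 1) / 4                  # mean of T under the null
    var_t = n * (n + 1) * (2 * n + 1) / 24  # variance of T under the null
    z = (t_stat - mu_t) / math.sqrt(var_t)
    return mu_t, var_t, z, NormalDist().cdf(z)


# Hypothetical inputs, for illustration only.
n, t_stat, alpha = 60, 600, 0.05
mu_t, var_t, z, lower_tail = signed_rank_normal_approx(t_stat, n)

print(f"mu_T = {mu_t}, sigma_T^2 = {var_t}, z = {z:.3f}")
print(f"P(Z < z) = {lower_tail:.4f}")
print("reject H0" if lower_tail < alpha / 2 else "do not reject H0")
```

Because T is the smaller of the two rank sums, it can never exceed $\mu_T$, so z is never positive and only the lower tail is needed.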

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 65 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Comparing populations Suppose I want to compare the heights of males and females

More information

The independent-means t-test:

The independent-means t-test: The independent-means t-test: Answers the question: is there a "real" difference between the two conditions in my experiment? Or is the difference due to chance? Previous lecture: (a) Dependent-means t-test:

More information

Comparison of Two Population Means

Comparison of Two Population Means Comparison of Two Population Means Esra Akdeniz March 15, 2015 Independent versus Dependent (paired) Samples We have independent samples if we perform an experiment in two unrelated populations. We have

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Motivations for the ANOVA We defined the F-distribution, this is mainly used in

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

AMS 7 Correlation and Regression Lecture 8

AMS 7 Correlation and Regression Lecture 8 AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation

More information

Solutions exercises of Chapter 7

Solutions exercises of Chapter 7 Solutions exercises of Chapter 7 Exercise 1 a. These are paired samples: each pair of half plates will have about the same level of corrosion, so the result of polishing by the two brands of polish are

More information

Introduction to hypothesis testing

Introduction to hypothesis testing Introduction to hypothesis testing Review: Logic of Hypothesis Tests Usually, we test (attempt to falsify) a null hypothesis (H 0 ): includes all possibilities except prediction in hypothesis (H A ) If

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Business Statistics. Lecture 5: Confidence Intervals

Business Statistics. Lecture 5: Confidence Intervals Business Statistics Lecture 5: Confidence Intervals Goals for this Lecture Confidence intervals The t distribution 2 Welcome to Interval Estimation! Moments Mean 815.0340 Std Dev 0.8923 Std Error Mean

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 26 (MWF) Tests and CI based on two proportions Suhasini Subba Rao Comparing proportions in

More information

Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.

Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6. Chapter 7 Reading 7.1, 7.2 Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.112 Introduction In Chapter 5 and 6, we emphasized

More information

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 7 Inference for Distributions Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 7 Inference for Distributions 7.1 Inference for

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Distribution-Free Procedures (Devore Chapter Fifteen)

Distribution-Free Procedures (Devore Chapter Fifteen) Distribution-Free Procedures (Devore Chapter Fifteen) MATH-5-01: Probability and Statistics II Spring 018 Contents 1 Nonparametric Hypothesis Tests 1 1.1 The Wilcoxon Rank Sum Test........... 1 1. Normal

More information

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests Chapters 3.5.1 3.5.2, 3.3.2 Prof. Tesler Math 283 Fall 2018 Prof. Tesler z and t tests for mean Math

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 31 (MWF) Review of test for independence and starting with linear regression Suhasini Subba

More information

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests Overall Overview INFOWO Statistics lecture S3: Hypothesis testing Peter de Waal Department of Information and Computing Sciences Faculty of Science, Universiteit Utrecht 1 Descriptive statistics 2 Scores

More information

+ Specify 1 tail / 2 tail

+ Specify 1 tail / 2 tail Week 2: Null hypothesis Aeroplane seat designer wonders how wide to make the plane seats. He assumes population average hip size μ = 43.2cm Sample size n = 50 Question : Is the assumption μ = 43.2cm reasonable?

More information

Rama Nada. -Ensherah Mokheemer. 1 P a g e

Rama Nada. -Ensherah Mokheemer. 1 P a g e - 9 - Rama Nada -Ensherah Mokheemer - 1 P a g e Quick revision: Remember from the last lecture that chi square is an example of nonparametric test, other examples include Kruskal Wallis, Mann Whitney and

More information

Physics 509: Non-Parametric Statistics and Correlation Testing

Physics 509: Non-Parametric Statistics and Correlation Testing Physics 509: Non-Parametric Statistics and Correlation Testing Scott Oser Lecture #19 Physics 509 1 What is non-parametric statistics? Non-parametric statistics is the application of statistical tests

More information

Chapter 18 Resampling and Nonparametric Approaches To Data

Chapter 18 Resampling and Nonparametric Approaches To Data Chapter 18 Resampling and Nonparametric Approaches To Data 18.1 Inferences in children s story summaries (McConaughy, 1980): a. Analysis using Wilcoxon s rank-sum test: Younger Children Older Children

More information

Performance Evaluation and Comparison

Performance Evaluation and Comparison Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation

More information

Wilcoxon Test and Calculating Sample Sizes

Wilcoxon Test and Calculating Sample Sizes Wilcoxon Test and Calculating Sample Sizes Dan Spencer UC Santa Cruz Dan Spencer (UC Santa Cruz) Wilcoxon Test and Calculating Sample Sizes 1 / 33 Differences in the Means of Two Independent Groups When

More information

ANOVA - analysis of variance - used to compare the means of several populations.

ANOVA - analysis of variance - used to compare the means of several populations. 12.1 One-Way Analysis of Variance ANOVA - analysis of variance - used to compare the means of several populations. Assumptions for One-Way ANOVA: 1. Independent samples are taken using a randomized design.

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 y 1 2 3 4 5 6 7 x Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 32 Suhasini Subba Rao Previous lecture We are interested in whether a dependent

More information

Resampling Methods. Lukas Meier

Resampling Methods. Lukas Meier Resampling Methods Lukas Meier 20.01.2014 Introduction: Example Hail prevention (early 80s) Is a vaccination of clouds really reducing total energy? Data: Hail energy for n clouds (via radar image) Y i

More information

Chapter 7 Comparison of two independent samples

Chapter 7 Comparison of two independent samples Chapter 7 Comparison of two independent samples 7.1 Introduction Population 1 µ σ 1 1 N 1 Sample 1 y s 1 1 n 1 Population µ σ N Sample y s n 1, : population means 1, : population standard deviations N

More information

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2 Review 6 Use the traditional method to test the given hypothesis. Assume that the samples are independent and that they have been randomly selected ) A researcher finds that of,000 people who said that

More information

Lecture 7: Hypothesis Testing and ANOVA

Lecture 7: Hypothesis Testing and ANOVA Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 26 (MWF) Tests and CI based on two proportions Suhasini Subba Rao Comparing proportions in

More information

Violating the normal distribution assumption. So what do you do if the data are not normal and you still need to perform a test?

Violating the normal distribution assumption. So what do you do if the data are not normal and you still need to perform a test? Violating the normal distribution assumption So what do you do if the data are not normal and you still need to perform a test? Remember, if your n is reasonably large, don t bother doing anything. Your

More information

Background to Statistics

Background to Statistics FACT SHEET Background to Statistics Introduction Statistics include a broad range of methods for manipulating, presenting and interpreting data. Professional scientists of all kinds need to be proficient

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests: One sided tests So far all of our tests have been two sided. While this may be a bit easier to understand, this is often not the best way to do a hypothesis test. One simple thing that we can do to get

More information

Module 9: Nonparametric Statistics Statistics (OA3102)

Module 9: Nonparametric Statistics Statistics (OA3102) Module 9: Nonparametric Statistics Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 15.1-15.6 Revision: 3-12 1 Goals for this Lecture

More information

appstats27.notebook April 06, 2017

appstats27.notebook April 06, 2017 Chapter 27 Objective Students will conduct inference on regression and analyze data to write a conclusion. Inferences for Regression An Example: Body Fat and Waist Size pg 634 Our chapter example revolves

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Boxplots and standard deviations Suhasini Subba Rao Review of previous lecture In the previous lecture

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor

More information

Statistics: CI, Tolerance Intervals, Exceedance, and Hypothesis Testing. Confidence intervals on mean. CL = x ± t * CL1- = exp

Statistics: CI, Tolerance Intervals, Exceedance, and Hypothesis Testing. Confidence intervals on mean. CL = x ± t * CL1- = exp Statistics: CI, Tolerance Intervals, Exceedance, and Hypothesis Lecture Notes 1 Confidence intervals on mean Normal Distribution CL = x ± t * 1-α 1- α,n-1 s n Log-Normal Distribution CL = exp 1-α CL1-

More information

HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă

HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and

More information

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only

More information

Comparing Means from Two-Sample

Comparing Means from Two-Sample Comparing Means from Two-Sample Kwonsang Lee University of Pennsylvania kwonlee@wharton.upenn.edu April 3, 2015 Kwonsang Lee STAT111 April 3, 2015 1 / 22 Inference from One-Sample We have two options to

More information

Chapter 6. Estimates and Sample Sizes

Chapter 6. Estimates and Sample Sizes Chapter 6 Estimates and Sample Sizes Lesson 6-1/6-, Part 1 Estimating a Population Proportion This chapter begins the beginning of inferential statistics. There are two major applications of inferential

More information

Do not copy, post, or distribute. Independent-Samples t Test and Mann- C h a p t e r 13

Do not copy, post, or distribute. Independent-Samples t Test and Mann- C h a p t e r 13 C h a p t e r 13 Independent-Samples t Test and Mann- Whitney U Test 13.1 Introduction and Objectives This chapter continues the theme of hypothesis testing as an inferential statistical procedure. In

More information

3. Nonparametric methods

3. Nonparametric methods 3. Nonparametric methods If the probability distributions of the statistical variables are unknown or are not as required (e.g. normality assumption violated), then we may still apply nonparametric tests

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

22s:152 Applied Linear Regression. 1-way ANOVA visual:

22s:152 Applied Linear Regression. 1-way ANOVA visual: 22s:152 Applied Linear Regression 1-way ANOVA visual: Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 Y We now consider an analysis

More information

Two-Sample Inferential Statistics

Two-Sample Inferential Statistics The t Test for Two Independent Samples 1 Two-Sample Inferential Statistics In an experiment there are two or more conditions One condition is often called the control condition in which the treatment is

More information

PHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1

PHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1 PHP2510: Principles of Biostatistics & Data Analysis Lecture X: Hypothesis testing PHP 2510 Lec 10: Hypothesis testing 1 In previous lectures we have encountered problems of estimating an unknown population

Statistics: revision

Statistics: revision NST 1B Experimental Psychology Statistics practical 5 Statistics: revision Rudolf Cardinal & Mike Aitken 29 / 30 April 2004 Department of Experimental Psychology University of Cambridge Handouts: Answers

Ch. 7. One sample hypothesis tests for µ and σ

Ch. 7. One sample hypothesis tests for µ and σ Ch. 7. One sample hypothesis tests for µ and σ Prof. Tesler Math 18 Winter 2019 Prof. Tesler Ch. 7: One sample hypoth. tests for µ, σ Math 18 / Winter 2019 1 / 23 Introduction Data Consider the SAT math

Inference for Distributions Inference for the Mean of a Population

Inference for Distributions Inference for the Mean of a Population Inference for Distributions Inference for the Mean of a Population PBS Chapter 7.1 009 W.H Freeman and Company Objectives (PBS Chapter 7.1) Inference for the mean of a population The t distributions The

t-test for b Copyright 2000 Tom Malloy. All rights reserved. Regression

Regression: t-test for b. Copyright 2000 Tom Malloy. All rights reserved. Recall, back some time ago, we used a descriptive statistic which allowed us to draw the best fit line through a scatter plot. We

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont. TCELL 9/4/205 36-309/749 Experimental Design for Behavioral and Social Sciences Simple Regression Example Male black wheatear birds carry stones to the nest as a form of sexual display. Soler et al. wanted

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown

Nonparametric Statistics. Leah Wright, Tyler Ross, Taylor Brown Nonparametric Statistics Leah Wright, Tyler Ross, Taylor Brown Before we get to nonparametric statistics, what are parametric statistics? These statistics estimate and test population means, while holding

Analysis of variance (ANOVA) Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups Analysis of variance (ANOVA) Comparing the means of more than two groups Example: Cost of mating in male fruit flies Drosophila Treatments: place males with and without unmated (virgin) females Five treatments

Chapter 23. Inferences About Means. Monday, May 6, 13. Copyright 2009 Pearson Education, Inc.

Chapter 23. Inferences About Means. Monday, May 6, 13. Copyright 2009 Pearson Education, Inc. Chapter 23 Inferences About Means Sampling Distributions of Means Now that we know how to create confidence intervals and test hypotheses about proportions, we do the same for means. Just as we did before,

Introduction to Nonparametric Statistics

Introduction to Nonparametric Statistics by James Bernhard, Spring 2012. Parameters / Parametric method / Nonparametric method: µ[X_2 - X_1] / paired t-test / Wilcoxon signed rank test; µ[X_1], µ[X_2] / 2-sample t-test

Correlation and Regression

Correlation and Regression Correlation and Regression Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 Learning Objectives Upon successful completion of this module, the student should

Analysis of 2x2 Cross-Over Designs using T-Tests

Analysis of 2x2 Cross-Over Designs using T-Tests Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous

Business Analytics and Data Mining Modeling Using R Prof. Gaurav Dixit Department of Management Studies Indian Institute of Technology, Roorkee

Business Analytics and Data Mining Modeling Using R. Prof. Gaurav Dixit, Department of Management Studies, Indian Institute of Technology, Roorkee. Lecture - 04 Basic Statistics Part-1 (Refer Slide Time: 00:33)

Preliminary Statistics Lecture 5: Hypothesis Testing (Outline)

Preliminary Statistics Lecture 5: Hypothesis Testing (Outline) 1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 5: Hypothesis Testing (Outline) Gujarati D. Basic Econometrics, Appendix A.8 Barrow M. Statistics

Density Temp vs Ratio. temp

[Figure: density histograms of Temp and Ratio; axes: temp, ratio, Density] 1. (a) The histogram shows that the temperature measures have two peaks,

Hypothesis Testing hypothesis testing approach formulation of the test statistic

Hypothesis Testing hypothesis testing approach formulation of the test statistic Hypothesis Testing For the next few lectures, we're going to look at various test statistics that are formulated to allow us to test hypotheses in a variety of contexts: In all cases, the hypothesis testing

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression 36-309/749 Experimental Design for Behavioral and Social Sciences Sep. 22, 2015 Lecture 4: Linear Regression TCELL Simple Regression Example Male black wheatear birds carry stones to the nest as a form

Things you always wanted to know about statistics but were afraid to ask

Things you always wanted to know about statistics but were afraid to ask Things you always wanted to know about statistics but were afraid to ask Christoph Amma Felix Putze Design and Evaluation of Innovative User Interfaces 6.12.13 1/43 Overview In the last lecture, we learned

1 Least Squares Estimation - multiple regression.

1 Least Squares Estimation - multiple regression. Introduction to multiple regression. Fall 2010 1 Least Squares Estimation - multiple regression. Let $y = \{y_1, \ldots, y_n\}$ be an $n \times 1$ vector of dependent variable observations. Let $\beta = \{\beta_0, \beta_1\}$ be the $2 \times 1$

psychological statistics

psychological statistics psychological statistics B Sc. Counselling Psychology 011 Admission onwards III SEMESTER COMPLEMENTARY COURSE UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION CALICUT UNIVERSITY.P.O., MALAPPURAM, KERALA,

Intuitive Biostatistics: Choosing a statistical test

Intuitive Biostatistics: Choosing a statistical test. This is chapter 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxford University Press Inc.

Non-parametric methods

Non-parametric methods Eastern Mediterranean University Faculty of Medicine Biostatistics course Non-parametric methods March 4&7, 2016 Instructor: Dr. Nimet İlke Akçay (ilke.cetin@emu.edu.tr) Learning Objectives 1. Distinguish

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x

Non-parametric (Distribution-free) approaches p188 CN

Non-parametric (Distribution-free) approaches p188 CN Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman's rank correlation; the Bootstrap. (ch14

WELCOME! Lecture 13 Thommy Perlinger

WELCOME! Lecture 13 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 13 Thommy Perlinger Parametrical tests (tests for the mean) Nature and number of variables One-way vs. two-way ANOVA One-way ANOVA Y X 1 1 One dependent variable

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Suhasini Subba Rao Review of previous lecture We showed if S n were a binomial random variable, where

LECTURE 5. Introduction to Econometrics. Hypothesis testing

LECTURE 5. Introduction to Econometrics. Hypothesis testing LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I. Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/46 Overview 1 Basics on t-tests 2 Independent Sample t-tests 3 Single-Sample

Pooled Variance t Test

Pooled Variance t Test Pooled Variance t Test Tests means of independent populations having equal variances Parametric test procedure Assumptions Both populations are normally distributed If not normal, can be approximated by

MAT Mathematics in Today's World

MAT Mathematics in Today's World MAT 1000 Mathematics in Today's World Last Time 1. Three keys to summarize a collection of data: shape, center, spread. 2. Can measure spread with the five-number summary. 3. The five-number summary can

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true. Statistical Hypothesis: conjecture about a population parameter

Descriptive Statistics CE 311S

Descriptive Statistics CE 311S CE 311S MEASURES OF LOCATION AND VARIABILITY As a starting point, we need a way to briefly summarize an entire sample with simple numerical values. This is the realm of descriptive statistics. For now,

Contrasts and Multiple Comparisons Supplement for Pages

Contrasts and Multiple Comparisons Supplement for Pages Contrasts and Multiple Comparisons Supplement for Pages 302-323 Brian Habing University of South Carolina Last Updated: July 20, 2001 The F-test from the ANOVA table allows us to test the null hypothesis

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) 2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random samples of a given size from a population. t ratio: $t = (\bar{X} - \mu_{\text{hyp}}) / s_{\bar{X}}$ The result

Population Variance. Concepts from previous lectures. HUMBEHV 3HB3 one-sample t-tests. Week 8

Population Variance. Concepts from previous lectures. HUMBEHV 3HB3 one-sample t-tests. Week 8 Concepts from previous lectures HUMBEHV 3HB3 one-sample t-tests Week 8 Prof. Patrick Bennett sampling distributions - sampling error - standard error of the mean - degrees-of-freedom Null and alternative/research

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 9 (MWF) Calculations for the normal distribution Suhasini Subba Rao Evaluating probabilities

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

Power and nonparametric methods. Basic statistics for experimental researchers 2017

Faculty of Health Sciences. Power and nonparametric methods. Basic statistics for experimental researchers 2017. Outline: Statistical power. Julie Lyng Forman, Department of Biostatistics, University of Copenhagen

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC

CHI SQUARE ANALYSIS: INTRODUCTION TO NON-PARAMETRIC ANALYSES. HYPOTHESIS TESTS SO FAR: We've discussed One-sample t-test, Dependent Sample t-tests, Independent Samples t-tests

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Chapter 6 The Standard Deviation as a Ruler and the Normal Model Chapter 6 The Standard Deviation as a Ruler and the Normal Model Overview Key Concepts Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread

Statistics Handbook. All statistical tables were computed by the author.

Statistics Handbook. All statistical tables were computed by the author. Statistics Handbook Contents Page Wilcoxon rank-sum test (Mann-Whitney equivalent) Wilcoxon matched-pairs test 3 Normal Distribution 4 Z-test Related samples t-test 5 Unrelated samples t-test 6 Variance

We're interested in Pr{three sixes when throwing a single die 8 times}. => Y has a binomial distribution, or in official notation, Y ~ BIN(n,p).

Sampling distributions and estimation. 1) A brief review of distributions: We're interested in Pr{three sixes when throwing a single die 8 times}. => Y has a binomial distribution, or in official notation,

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 7 Inferences Based on Two Samples: Confidence Intervals & Tests of Hypotheses Content 1. Identifying the Target Parameter 2. Comparing Two Population Means:

Nonparametric tests. Mark Muldoon School of Mathematics, University of Manchester. Mark Muldoon, November 8, 2005 Nonparametric tests - p.

Nonparametric tests. Mark Muldoon, School of Mathematics, University of Manchester, November 8, 2005. Overview: The sign, motivation; The Mann-Whitney; Larger; Larger, in pictures

MORE ON MULTIPLE REGRESSION

MORE ON MULTIPLE REGRESSION DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 816 MORE ON MULTIPLE REGRESSION I. AGENDA: A. Multiple regression 1. Categorical variables with more than two categories 2. Interaction

Dealing with the assumption of independence between samples - introducing the paired design.

Dealing with the assumption of independence between samples - introducing the paired design. Dealing with the assumption of independence between samples - introducing the paired design. a) Suppose you deliberately collect one sample and measure something. Then you collect another sample in such

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

Data analysis and Geostatistics - lecture VII

Data analysis and Geostatistics - lecture VII: t-tests, ANOVA and goodness-of-fit. Statistical testing - significance of r. Testing the significance of the correlation coefficient: $t = r\sqrt{n-2} / \sqrt{1-r^2}$ with

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

PSY 307 Statistics for the Behavioral Sciences. Chapter 20: Tests for Ranked Data, Choosing Statistical Tests. What To Do with Non-normal Distributions. Transformations (pg 382): The shape of the distribution

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength
