Data Analysis and Statistical Methods Statistics 651
- Shon Baldwin
Suhasini Subba Rao

Review

In the previous lecture we considered the following tests:

The independent sample t-test. This is used when we have two completely independent samples drawn from two populations and we want to compare their population means. If the sample sizes are relatively large and there are few outliers, we can test equality of the means (or one-sided versions) using the independent sample t-test.

The Wilcoxon rank-sum test. This is used when we have two completely independent samples drawn from two populations and we want to compare their population means (or distributions). If the sample sizes are small and there appear to be outliers, we can test equality of their distributions (means) using the Wilcoxon rank-sum test (we do not have standard errors in this test).

The paired t-test. This is used when we have paired observations, with the two members of each pair coming from different populations, e.g. the running time of a runner at high and low altitudes. In this case the observations within a pair are dependent, so we cannot use either of the above tests. We can check for dependence by plotting the pairs against each other (e.g. high altitude against low altitude). If the sample size is relatively large and there are not many outliers, use a paired t-test.

The Wilcoxon signed-rank test. Today's class.

A quick review on the idea of testing

Suppose a bomb was exploded at the center 0. The bomb spreads debris everywhere. The closer you are to the bomb, the more likely the debris will hit you. The standard deviation of the distance the debris travels (remember it is a measure of the amount of spread) is 3 miles (this means the variance is 9). Suppose the spread of the debris has a normal distribution. Basically this means that about 95% of the debris will be spread within a radius of $1.96 \times 3$ miles (on either side) of 0. The other 5% will fall outside this radius.

Suppose I am standing 4 miles from the center (0). Do you think the bomb will hit me?
Suppose I am standing 9 miles to the right of the center. Do you think I will get hit? What is the probability that I will get hit? It is $P(Z > 9/3) = P(Z > 3) \approx 0.0013$. This is very small.
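The tail probability in this story can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `prob_beyond` is made up for illustration. It also evaluates a wider spread of 6 miles, to show how the ratio distance/(standard deviation) drives the answer.

```python
from statistics import NormalDist


def prob_beyond(distance, sd):
    """Return P(Z > distance / sd) for a standard normal Z."""
    return 1.0 - NormalDist().cdf(distance / sd)


# Standing 9 miles to the right of the center, with a spread of 3 miles.
p_sd3 = prob_beyond(9, 3)
# Same distance, but a wider spread of 6 miles (ratio 1.5 instead of 3).
p_sd6 = prob_beyond(9, 6)

print(f"sd = 3 miles: P(Z > 3.0) = {p_sd3:.4f}")
print(f"sd = 6 miles: P(Z > 1.5) = {p_sd6:.4f}")
```

The first probability is about 0.0013 and the second about 0.067, so the same distance is far less surprising when the spread is larger.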
Suppose I am standing 9 miles to the right of the center and I did get hit. Everyone is telling me the bomb was exploded at 0. But I know that the probability of me being hit when the bomb is located at center 0 is about 0.1%. I am suspicious, because this number is really small, though I do not know the exact location of the bomb. But I am pretty sure that it is not at 0 or at any location less than 0.

This value is a p-value. It is the probability of being hit when the center is zero. If it is small, it is unlikely that the center is zero. This means rejecting the null that the bomb is located at zero.

Another story

A more scattering bomb is exploded at center 0. The standard deviation (measure of spread) of the debris is 6 miles (this means the variance is 36). This means that 95% of the debris will be spread within $1.96 \times 6$ miles of the center 0. I am standing 9 miles away from the bomb. The chance of me being hit is $P(Z > 9/6) = P(Z > 1.5) \approx 0.067$. This means roughly a 7% chance of me being hit when the bomb is at zero. Suppose I get hit. Then I cannot say that the bomb is at zero, but I also cannot say that it is not at zero. 0.067 is the p-value for me being hit when the bomb was exploded at zero. This means we cannot reject the null that the bomb is located at zero.

What have you learnt from my stories

- I am an idiot for standing so close to the bombs.
- The distance that is safe to stay away from the bomb depends on the standard deviation (the variance).
- The smaller the ratio distance/(standard deviation), the more likely I am to be hit when the center is zero.
- The larger the ratio distance/(standard deviation), the less likely I am to be hit when the center is zero.

A quick review of JMP output

Typically the output in JMP has the columns:

Mean (difference), Std. Error, Upper 95%, Lower 95%, N, t ratio, DF, Prob > |t|

- Mean is the sample mean, or the difference in the sample means.
- Std. Error refers to the estimated standard deviation of the estimator (sample mean).
- N refers to the sample size.
- t ratio refers to the z-transform of the estimator. Usually t = Mean/Std. Error.
- DF refers to the degrees of freedom of the t-distribution used.
- Prob > |t| refers to the two-sided p-value when testing the hypothesis $H_0$: true mean = 0 (or difference in means is equal to zero) against the alternative $H_A$: true mean not equal to zero (or difference in means is not equal to zero).

JMP output and what you should be able to do with it

Using this output you should be able to:

- Construct 90% CIs, 99% CIs, etc.
- Test the hypothesis $H_0$: true mean = 5, for example (or difference in means is equal to five), against the alternative $H_A$: true mean not equal to 5 (or difference in means is not equal to five).

Tests we have done so far

The output table (heading "Unstandardised Coefficients", columns Example, B, Std. Error, t, Sig) contains the following for each test:

One sample mean: $B = \bar{X}$, Std. Error $= \sqrt{s^2/n}$, $t = \dfrac{\bar{X} - \mu_0}{\sqrt{s^2/n}}$, Sig $= P\left(|t_{n-1}| > \left|\dfrac{\bar{X} - \mu_0}{\sqrt{s^2/n}}\right|\right)$.

Independent samples t-test: $B = \bar{D}$, Std. Error $= \sqrt{s_p^2\left(\frac{1}{n} + \frac{1}{m}\right)}$, $t = \dfrac{\bar{D}}{\sqrt{s_p^2\left(\frac{1}{n} + \frac{1}{m}\right)}}$, Sig $= P\left(|t_{n+m-2}| > \left|\dfrac{\bar{D}}{\sqrt{s_p^2\left(\frac{1}{n} + \frac{1}{m}\right)}}\right|\right)$.

Paired t-test: $B = \bar{D}$, Std. Error $= \sqrt{s_d^2/n}$, $t = \dfrac{\bar{D}}{\sqrt{s_d^2/n}}$, Sig $= P\left(|t_{n-1}| > \left|\dfrac{\bar{D}}{\sqrt{s_d^2/n}}\right|\right)$.

Remember $|\cdot|$ means taking the positive value of a number, e.g. $|-3| = 3$. Remember standard deviation is a measure of spread. If we have normality then a 95% CI for the true parameter is $[B - 1.96 \times \text{Std. Error},\ B + 1.96 \times \text{Std. Error}]$. Roughly, this means it is highly likely the true mean lies in this interval.

In the above table the p-value is evaluated under the null hypothesis of each test. That is:

Sample mean: $B = \bar{X}$, $H_0: \mu = \mu_0$ against $H_A: \mu \neq \mu_0$.
t-test: $B = \bar{D}$, $H_0: \mu_1 - \mu_2 = 0$ against $H_A: \mu_1 - \mu_2 \neq 0$.
Paired t-test: $B = \bar{D}$, $H_0: \mu_1 - \mu_2 = 0$ against $H_A: \mu_1 - \mu_2 \neq 0$.
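As an illustration of what can be done with such output, the sketch below builds 90% and 99% intervals and tests a null value other than zero. The numbers (estimate 6.2, standard error 0.5) are hypothetical, and the function names are made up for this example. It uses normal quantiles, as in the 1.96 rule above; for small DF the normal quantile would need to be replaced by the corresponding t quantile from a table.

```python
from statistics import NormalDist


def normal_ci(estimate, std_error, level=0.95):
    """Confidence interval estimate +/- z * std_error using normal quantiles."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error


def ratio(estimate, std_error, null_value=0.0):
    """Standardised distance between the estimate and the null value."""
    return (estimate - null_value) / std_error


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical output values (not taken from the lecture).
mean, se = 6.2, 0.5

ci90 = normal_ci(mean, se, 0.90)
ci99 = normal_ci(mean, se, 0.99)
z5 = ratio(mean, se, null_value=5.0)   # test H0: true mean = 5
p5 = two_sided_p(z5)

print("90% CI:", ci90)
print("99% CI:", ci99)
print("ratio for H0: mean = 5:", z5, "two-sided p-value:", p5)
```

Note that the t ratio printed by the software is for the null value 0; testing a different null value only requires subtracting it from the estimate before dividing by the standard error.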
Example I: Runners at altitude

Runners were compared at a high and a low altitude. For each runner, the running time was measured at a high altitude and then again at a low altitude. 12 runners were used.

[Table: Runner | High | Low]

Do you think altitude has an effect on running time? Let $\mu_Y$ denote the mean time at a low altitude and $\mu_X$ denote the mean time at a high altitude. Use $\alpha = 0.05$. We want to test $H_0: \mu_Y - \mu_X = \mu_d \ge 0$ against the alternative $H_A: \mu_Y - \mu_X = \mu_d < 0$.

Solution I: Using the paired t-test

[Table: Runner | High | Low | Low - High]

Using the sample differences, calculate the sample mean and sample variance: $\bar{D} \approx -1.2$ and $s_d^2 = \frac{1}{11}\sum_{i=1}^{12}(D_i - \bar{D})^2 \approx 1.16$. Since $\bar{D}$ is negative, which is on the side of the alternative (noting it is a one-sided test), we can do the test.

Under the null hypothesis
$$\frac{\bar{D}}{\sqrt{s_d^2/12}} \sim t(12 - 1).$$

Now we construct a rejection region. The rejection region is towards the left-hand side (we see this from the alternative). Hence we reject the null if $\bar{D} = \bar{Y} - \bar{X}$ is less than $0 - t_{0.05}(11)\sqrt{s_d^2/12}$. Since $\bar{D} \approx -1.2$ is less than this threshold, we can reject the null. Equivalently, using JMP we can get the p-value: $P(t_{11} \le -3.89)$ is so small that there is enough evidence to reject the null.

Example II: The use of cell phones

There has been a lot of speculation that the use of cell phones while driving has increased the number of accidents. Scientists wanted to test whether talking on a cell phone increases a driver's reaction time. To test the hypothesis they randomly sampled 30 people and placed each of them in a car simulator. For each driver, the reaction time to the sudden appearance of a coloured stimulus was recorded both when the driver was not on the phone and when the driver was on the phone. Based on this sample the average reaction time when not on the phone was $\bar{X} = 0.5$ seconds. The average reaction time when on the phone was $\bar{Y} = 0.7$ seconds. The pooled sample variance is $s_p^2 = 0.6$ and the sample variance of the differences is $s_d^2 = 0.01$.
Is there evidence to suggest that using a cell phone increases the reaction time while driving (state the test you would use, the hypothesis, and do the test at the 5% level)?

Solutions II

Because the same person is being used in both experiments (and it is highly likely that reaction time depends on the individual), it would be wise to do a paired test (since there is likely to be dependence between the pairs).

Let $\mu_X$ be the mean reaction time when not on the phone and $\mu_Y$ be the mean reaction time when on the phone. Let $\mu_d = \mu_Y - \mu_X$; this is the mean reaction time on the phone minus the mean reaction time off the phone. We do not observe $\mu_X$, $\mu_Y$ and $\mu_d$. However, it is conjectured that reaction time increases with cell phone use. If this is true then $\mu_d = \mu_Y - \mu_X > 0$ (i.e. average reaction time on the phone minus average reaction time without the phone is greater than zero). Hence we want to test $H_0: \mu_d \le 0$ against $H_A: \mu_d > 0$.

Since the pairs in the data are dependent, we use the paired t-test. This means that to do the test we use the sample variance of the differences, which is $s_d^2 = 0.01$, and as an estimate of $\mu_d$ we use $\bar{D} = \bar{Y} - \bar{X} = 0.7 - 0.5 = 0.2$. Using $\bar{D} = 0.2$, $s_d^2 = 0.01$ and $n = 30$ (there are $n$ pairs) we can do the test just as if it were a one-sample test, but using the differences instead (hence we use a t-test, since we estimate the variance). We note that the standard error is $\sqrt{s_d^2/n} = \sqrt{0.01/30} \approx 0.018$.

We construct a 5% rejection region (RR). It is on the right-hand side (since $H_A$ is pointing right). Remember that to construct the RR we need to center it about the mean in the null, which is 0. Hence the RR is any value of $\bar{D}$ greater than $0 + t_{29}(0.05)\sqrt{s_d^2/n} = 1.699 \times 0.018 \approx 0.03$. Since $\bar{D} = 0.2 > 0.03$, there is evidence to reject the null.

Question

Suppose we want to test that the average reaction time increases by more than 0.1 second. What would the conclusion of the test be (using $\alpha = 5\%$)?
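The rejection-region calculation for the cell phone example can be sketched from the summary statistics alone. This is a minimal illustration and not the lecture's own code: the function name is made up, the value $s_d^2 = 0.01$ is an assumption (it is the value consistent with the 0.03 threshold quoted above), and the critical value 1.699 is the upper 5% point of t(29) from a standard t table, hard-coded because the Python standard library has no t quantile function.

```python
import math


def paired_t_right_sided(d_bar, s2_d, n, t_crit, null_value=0.0):
    """One-sided (right) paired t-test from summary statistics.

    Returns the standard error, the t ratio, the rejection threshold for
    the mean difference, and whether the null is rejected.
    """
    std_error = math.sqrt(s2_d / n)
    t_ratio = (d_bar - null_value) / std_error
    threshold = null_value + t_crit * std_error
    return std_error, t_ratio, threshold, d_bar > threshold


T_CRIT_29_05 = 1.699  # upper 5% point of t(29), from a standard t table

std_error, t_ratio, threshold, reject = paired_t_right_sided(
    d_bar=0.7 - 0.5,  # mean on the phone minus mean off the phone
    s2_d=0.01,        # assumed sample variance of the differences
    n=30,             # number of pairs
    t_crit=T_CRIT_29_05,
)

print(f"standard error = {std_error:.4f}")
print(f"t ratio        = {t_ratio:.2f}")
print(f"threshold      = {threshold:.4f}")
print(f"reject H0      = {reject}")
```

Passing a different `null_value` moves the center of the rejection region, which is the only change needed when the null states an increase other than zero.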
What to do when the number of pairs is small?

The sample size n = 6 is small, and we have used a t-distribution with 5 degrees of freedom (which means it is not very sensitive at detecting effects, since the critical value of t(5) is quite large). [But despite this we have still rejected the null hypothesis.]

Important

For obvious reasons a paired t-test is only possible when the two samples are of the same size. But to do the paired t-test we require the usual assumptions. If n is small, then the observations must come from a normal distribution (difficult to check in practice). If n is large it does not matter, because $\bar{D}$ will be almost normal.

The small sample size raises the question: is there a nonparametric version of this test, which does not require normality of the $D_i$?

A nonparametric alternative: the Wilcoxon signed-rank test (uses Table 6)

This is to test $H_0: \mu_d = 0$ against $H_A: \mu_d \neq 0$. We do not require normality of the $D_i$, but the distribution of the differences $D_i$ must be symmetric about the median (one of the main assumptions of this test). Hence the test is equivalent to checking whether the median of the differences is zero, against the alternative that it is not zero.

Recipe:

- Calculate the difference between the pairs of samples, $D_i = X_i - Y_i$.
- Delete all zero values and let n be the number of non-zero values.
- List the absolute values of the differences and rank them from smallest to largest.
- To each rank give a sign (negative or positive), depending on whether the difference is negative or positive.
- Add all the negative ranks together, call this $T_-$.
- Add all the positive ranks together, call this $T_+$.
- Find the smaller of $T_-$ and $T_+$, and label this $T$ ($T = \min(T_-, T_+)$).

Wilcoxon signed-rank test for the Friday 13th data

[Table: Sample 1 | Sample 2 | Difference | Sign | Abs. | Rank; totals $T_- = 20.5$, $T_+ = 1.5$]

Look up Table 6. Using the number of pairs n, and depending on whether you are doing a two-sided or one-sided test and on $\alpha$, select the appropriate value. If $T$ is less than this value, there is enough evidence to reject the null.
The test is $H_0$: the mean numbers of accidents which happen on Friday 13th and on Friday 6th are the same ($\mu_d = 0$), against the alternative $H_A$: the mean numbers of accidents which happen on Friday 13th and on Friday 6th are different ($\mu_d \neq 0$).
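The recipe above can be sketched in a few lines of Python. This is an illustration only: the data are hypothetical (not the Friday 13th data), the function name is made up, and tied absolute differences are given the average of the ranks they occupy, which is the usual convention and an assumption here.

```python
def signed_rank_statistic(x, y):
    """Return (n, T_plus, T_minus, T) for the Wilcoxon signed-rank recipe."""
    # Differences between pairs, with zero differences deleted.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)

    # Rank the absolute differences; ties share the average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Sum the ranks belonging to positive and to negative differences.
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return n, t_plus, t_minus, min(t_plus, t_minus)


# Hypothetical paired data: seven pairs, one of which has a zero difference.
sample_1 = [12, 15, 9, 20, 14, 11, 17]
sample_2 = [14, 15, 13, 18, 20, 13, 22]

n, t_plus, t_minus, t_stat = signed_rank_statistic(sample_1, sample_2)
print(f"n = {n}, T+ = {t_plus}, T- = {t_minus}, T = {t_stat}")
```

A useful check on any hand calculation is that $T_+ + T_-$ must equal $n(n+1)/2$, the sum of the ranks 1 to n. The resulting $T$ is then compared with the value from the table for the chosen $\alpha$ and n.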
The Wilcoxon signed-rank test and Table 6

- $T_+$ = sum of positive ranks ($T_+ = 1.5$).
- $T_-$ = sum of negative ranks ($T_- = 20.5$).
- $T$ = smaller of $T_+$ and $T_-$ (in our example $T = 1.5$).
- Look up Table 6 for a particular $p = \alpha$ and n (so in our case we use a two-sided test with p = 0.05 and n = 6).
- Reject $H_0$ if $T$ < value in the table.

Looking up the table with p = 0.05 and n = 6 we see that Value = 0. Since 1.5 > 0, using the Wilcoxon signed-rank test we do not have enough evidence to reject the null hypothesis. Though I think this is just because the sample size is too small and the power of the test is zero! If we increased the level to $\alpha = 0.2$, then Value = 2 and we would be able to reject the null. Based on this test, what are your conclusions about Friday 13th?

Reminder: What test to do when...

If we want to test whether two samples come from the same population (or whether one distribution is a shift of another) and both samples are of the same size, and

- pairs of observations are independent, and
  - the sample size is large, or it is small and the data are normal, then use the t-test;
  - the sample size is small and the data are not normal, then use the Wilcoxon rank-sum (Mann-Whitney U) test.
- pairs of observations are dependent, and
  - the sample size is large, or it is small and the data are normal, then use the paired t-test;
  - the sample size is small and the data are not normal, then use the Wilcoxon signed-rank test.

We should use the test which suits the data, because loss of power can occur if we use the wrong test.

Aside

What happens when we use the wrong test: for the t-test comparing two samples of the same size, the non-rejection region is
$$\left[-t_{\alpha/2}(2n-2)\, s_p\sqrt{2/n},\ \ t_{\alpha/2}(2n-2)\, s_p\sqrt{2/n}\right],$$
where $s_p^2$ is the pooled variance. For the paired t-test we use a t-distribution with $n-1$ degrees of freedom, and the non-rejection region is
$$\left[-t_{\alpha/2}(n-1)\, s_d/\sqrt{n},\ \ t_{\alpha/2}(n-1)\, s_d/\sqrt{n}\right], \qquad s_d^2 = \frac{1}{n-1}\sum_i (d_i - \bar{d})^2.$$
Remember $s_d$ and $s_p$ are different. In the case of independent pairs, $s_d \approx \sqrt{2}\, s_p$. By comparing the non-rejection regions we see:

- If there is large dependence between pairs and we mistakenly use the t-test, we lose power because $s_p$ is large.
- If there is only a small dependence between pairs and we mistakenly use the paired t-test, we lose power because we are using a $t(n-1)$ rather than a $t(2n-2)$.

Example: Runners at altitude

Runners were compared at a high and a low altitude. For each runner, the running time was measured at a high altitude and then again at a low altitude. 12 runners were used.

[Table: Runner | High | Low]

Use the Wilcoxon signed-rank test to test the hypothesis that the median difference is zero against the alternative that it is different from zero.

Aside: The normal approximation of the signed-rank test

When n > 50 we use a normal approximation. Let
$$\mu_T = \frac{n(n+1)}{4}, \qquad \sigma_T^2 = \frac{n(n+1)(2n+1)}{24}.$$
Under the null,
$$Z = \frac{T - \mu_T}{\sqrt{\sigma_T^2}} \sim N(0, 1).$$
Calculate $P(Z < z)$, where $z$ is the observed value of $Z$. If this is smaller than $\alpha/2$, reject $H_0$.
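The normal approximation can be sketched as follows. The inputs (n = 60 non-zero differences and T = 600) are hypothetical values chosen only to illustrate the arithmetic, and the function name is made up for this example.

```python
import math
from statistics import NormalDist


def signed_rank_normal_approx(t_stat, n):
    """Normal approximation for the signed-rank statistic T with n pairs."""
    mu_t = n * (n + 1) / 4
    var_t = n * (n + 1) * (2 * n + 1) / 24
    z = (t_stat - mu_t) / math.sqrt(var_t)
    p_lower = NormalDist().cdf(z)  # P(Z < z)
    return mu_t, var_t, z, p_lower


alpha = 0.05
mu_t, var_t, z, p_lower = signed_rank_normal_approx(t_stat=600, n=60)

print(f"mu_T = {mu_t}, sigma_T^2 = {var_t}")
print(f"z = {z:.3f}, P(Z < z) = {p_lower:.4f}")
print("reject H0" if p_lower < alpha / 2 else "do not reject H0")
```

Because $T$ is the smaller of the two rank sums, $z$ is never positive, which is why only the lower tail is compared with $\alpha/2$.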
More informationpsychological statistics
psychological statistics B Sc. Counselling Psychology 011 Admission onwards III SEMESTER COMPLEMENTARY COURSE UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION CALICUT UNIVERSITY.P.O., MALAPPURAM, KERALA,
More informationIntuitive Biostatistics: Choosing a statistical test
pagina 1 van 5 < BACK Intuitive Biostatistics: Choosing a statistical This is chapter 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc.
More informationNon-parametric methods
Eastern Mediterranean University Faculty of Medicine Biostatistics course Non-parametric methods March 4&7, 2016 Instructor: Dr. Nimet İlke Akçay (ilke.cetin@emu.edu.tr) Learning Objectives 1. Distinguish
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x
More informationNon-parametric (Distribution-free) approaches p188 CN
Week 1: Introduction to some nonparametric and computer intensive (re-sampling) approaches: the sign test, Wilcoxon tests and multi-sample extensions, Spearman s rank correlation; the Bootstrap. (ch14
More informationWELCOME! Lecture 13 Thommy Perlinger
Quantitative Methods II WELCOME! Lecture 13 Thommy Perlinger Parametrical tests (tests for the mean) Nature and number of variables One-way vs. two-way ANOVA One-way ANOVA Y X 1 1 One dependent variable