Hypothesis Testing. Mean (SDM)

Similar documents
CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

PSY 216. Assignment 9 Answers. Under what circumstances is a t statistic used instead of a z-score for a hypothesis test

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

HYPOTHESIS TESTING. Hypothesis Testing

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

Two-Sample Inferential Statistics

16.400/453J Human Factors Engineering. Design of Experiments II

Econ 325: Introduction to Empirical Economics

Ch. 7: Estimates and Sample Sizes

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Chapter 8: Confidence Intervals

T-Test QUESTION T-TEST GROUPS = sex(1 2) /MISSING = ANALYSIS /VARIABLES = quiz1 quiz2 quiz3 quiz4 quiz5 final total /CRITERIA = CI(.95).

Sampling Distributions

Ch. 7. One sample hypothesis tests for µ and σ

20 Hypothesis Testing, Part I

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

CENTRAL LIMIT THEOREM (CLT)

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

Confidence intervals

Lecture on Null Hypothesis Testing & Temporal Correlation

Chapter 23. Inferences About Means. Monday, May 6, 13. Copyright 2009 Pearson Education, Inc.

Chapter 23. Inference About Means

Business Statistics: Lecture 8: Introduction to Estimation & Hypothesis Testing

The t-statistic. Student s t Test

Hypotheses and Errors

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institution of Technology, Kharagpur

Preliminary Statistics Lecture 5: Hypothesis Testing (Outline)

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Data analysis and Geostatistics - lecture VI

Chapter 6. Estimates and Sample Sizes

PSY 305. Module 3. Page Title. Introduction to Hypothesis Testing Z-tests. Five steps in hypothesis testing

Exam 2 (KEY) July 20, 2009

Process Watch: Having Confidence in Your Confidence Level

Single Sample Means. SOCY601 Alan Neustadtl

Descriptive Statistics-I. Dr Mahmoud Alhussami

Applied Statistics for the Behavioral Sciences

PhysicsAndMathsTutor.com

Confidence Intervals. - simply, an interval for which we have a certain confidence.

Two Sample Problems. Two sample problems

The simple linear regression model discussed in Chapter 13 was written as

An inferential procedure to use sample data to understand a population Procedures

Statistics for IT Managers

Review of Statistics 101

Slides for Data Mining by I. H. Witten and E. Frank

Statistical Foundations:

Probability and Discrete Distributions

Samples and Populations Confidence Intervals Hypotheses One-sided vs. two-sided Statistical Significance Error Types. Statistiek I.

5.2 Tests of Significance

WISE Regression/Correlation Interactive Lab. Introduction to the WISE Correlation/Regression Applet

Stat 529 (Winter 2011) Experimental Design for the Two-Sample Problem. Motivation: Designing a new silver coins experiment

The Difference in Proportions Test

Answer keys for Assignment 10: Measurement of study variables (The correct answer is underlined in bold text)

Analysis of 2x2 Cross-Over Designs using T-Tests

Dealing with the assumption of independence between samples - introducing the paired design.

ECO220Y Review and Introduction to Hypothesis Testing Readings: Chapter 12

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

Section 9.1 (Part 2) (pp ) Type I and Type II Errors

Black White Total Observed Expected χ 2 = (f observed f expected ) 2 f expected (83 126) 2 ( )2 126

Factorial Independent Samples ANOVA

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

AP Statistics Ch 12 Inference for Proportions

Frequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<=

Independent Samples ANOVA

Chapter 2 Descriptive Statistics

79 Wyner Math Academy I Spring 2016

Business Statistics. Lecture 5: Confidence Intervals

Hypothesis Testing. ECE 3530 Spring Antonio Paiva

Chapter Six: Two Independent Samples Methods 1/51

*Karle Laska s Sections: There is no class tomorrow and Friday! Have a good weekend! Scores will be posted in Compass early Friday morning

CHAPTER 7. Hypothesis Testing

Contrasts (in general)

Binary Logistic Regression

Tutorial 2: Power and Sample Size for the Paired Sample t-test

Chapter 7: Hypothesis Testing - Solutions

Mathematical Notation Math Introduction to Applied Statistics

Hypothesis Testing in Action: t-tests

23. MORE HYPOTHESIS TESTING

Using SPSS for One Way Analysis of Variance

Sampling Distributions: Central Limit Theorem

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption

OHSU OGI Class ECE-580-DOE :Statistical Process Control and Design of Experiments Steve Brainerd Basic Statistics Sample size?

10.4 Hypothesis Testing: Two Independent Samples Proportion

A proportion is the fraction of individuals having a particular attribute. Can range from 0 to 1!

We need to define some concepts that are used in experiments.

ST505/S697R: Fall Homework 2 Solution.

AP Statistics Cumulative AP Exam Study Guide

The Student s t Distribution

Multiple Comparisons

HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă

Testing Research and Statistical Hypotheses

Chapter 20 Comparing Groups

Hypothesis testing. Data to decisions

Inference with Simple Regression

Purposes of Data Analysis. Variables and Samples. Parameters and Statistics. Part 1: Probability Distributions

Lab 5 for Math 17: Sampling Distributions and Applications

Transcription:

Confidence Intervals and Hypothesis Testing Readings: Howell, Ch. 4, 7 The Sampling Distribution of the Mean (SDM) Derivation - See Thorne & Giesen (T&G), pp. 169-171 or online Chapter Overview for Ch. 9 http://www.mhhe.com/thorne4 See also Howell, Chapter 4.2, 7.1

Getting to Know You Name, address, and phone number? With distributions, it s Central tendency The mean of the sampling distribution of M Variability The variance of the sampling distribution of M The standard deviation of the sampling distribution of M Shape Properties of the Sampling Properties of the Sampling Distribution of the Mean (SDM):

The Sampling Distribution of the Mean: How is it constructed? The sampling distribution of the mean (SDM) is constructed by drawing samples of some fixed size (say N = 10), calculating the mean of the samples, repeating the process for a large number of samples, and plotting the resulting means in a frequency polygon. The resulting distribution is the sampling distribution of the mean. Property 1 The population mean of the SDM equals the mean of the Parent Population. In symbols:

Property 2 The larger the size of the sample (N ) taken when constructing the SDM, the more closely the SDM will approximate a normal curve. If the parent population is normal, the SDM will be exactly normal. Property 3 The larger the size of the sample (N), the smaller the standard deviation of the SDM. The standard deviation of the sampling distribution of the mean is called the standard error of the mean. In symbols:

AKA The Central Limit Theorem Howell says, This is one of the most important theorems in statistics. (7.1) His version, p. 180: Also see this section for his illustration of the shape property. Rice Virtual Statistics i LbSi Lab Simulation i Forming Any z Score z = (score - mean of scores)/st dev of scores z (X) = z (M) = Finding M for a z - Formula 9-3 (T&G); Howell, 7.3 in context of Conf. Intervals.

Estimation and Degrees of Freedom Estimation for μ Estimation of sigma-squared or sigma N-1 in that case was the degrees of freedom General definition for degrees of freedom the number of values that are free to vary after certain restrictions have been placed on the data. Given that 5 scores must sum to zero, let s try it. See Howell, p. 187 for a more specific rationale. Estimating the Standard Error of the Mean Standard Error of the Mean formula: Estimated Standard Error of the Mean formula: Formula for z score: Formula when we estimate the standard error of the mean:

Summary Parameter Name Symbol Equivalence Sample Estimate Name Symbol Equivalence Population Mean of the SDM Sample Mean of the SDM Population Variance of the SDM Sample Variance of the SDM Population Standard Deviation of the SDM* Sample standard error of the mean The t distribution- A little history Ireland, the Guinness brewing company, and William Sealy Gosset t distribution developed to be used with small samples and unknown population variances Student st t -- rather than Gosset s s t-- so that the company wouldn t have to admit to a bad batch. More

t and z -- What s the difference? Sketch Role of the df Formula See Howell, Figure 7.5, p. 187. The Confidence Interval Finding deviant mean scores Sketch: Note The horizontal axis is the X axis\ The deviant scores are now means Formula (where M = X-bar) 95% CI = +/- t(.05/2) s(m) + M 99% CI = +/- t(.01/2) s(m) + M See also, Howell, 7.3, Confidence Interval on µ

Example: Checking Progress Solution Interpretations We can be 95% confident that the average digit span of the adult population p is at least 6.32 and at most 7.90 digits. We can be 99% confident that the average digit span of the adult population is at least 6.05 and at most 8.17 digits. Sketches

What the Confidence Interval Really Means Confidence -- not probability The sample has already been drawn For a 95% confidence interval, if the experiment were repeated 100 times, 95 of the intervals would contain (capture) the true population mean μ. μ The rub is, for any given one, we don t know forsureifitcapturesμ it μ or not, hence the confidence language. Confidence vs. Probablilty We place our conficence in the method rather than the interval. You can say your confidence is.95 that Mu is found in the interval. You re at risk if you say that the probability is.95 that Mu lies in the interval. Many will pounce if you do. See Howell,,p. 193.

Hypothesis Testing: One-Sample t Test Example Handout Ratings of Speech http://www2.msstate.edu/~jmg1/3103/one_s ample_ttest.pdf

Hypothesis Testing: One-Sample t Test How deviant or unlikely is a particular sample mean? Analogous to finding the deviant probability of a score with the z curve Start from knowledge or assumption of the parent population. We know from much previous intelligence research that the population p mean adult digit span is 7.0 (μ = 7). Another Example Research Context Does amphetamine (a stimulant) affect recent memory? 25 volunteers take a small dose then take the digit span test M = 7.53 and s = 0.97 What is the probability of a mean this deviant? Outline: M --> >t --> Area (or probability)

Set up 7-step procedure Solution We assumed our sample came from the adult population with average digit span of 7. Our sample mean was 7.53, and this was 2.73 standard error units away from the mean. S(M) = s/sqrt 0.97 = 0.194 Tcal = (7.53 7.00)/0.194 = 2.73 Tcritcal = t (25,.025) = 2.0639 (look-up) t(calculated) = 2.73, and the probability of a score this or more deviant was less than.05. A sample mean this deviant is very unlikely to have come from a population where μ = 7. Conclusion A one-sample t test was performed to determine if amphetamine affects adult digit span. The digit span for the amphetamine sample (M = 7.53) was significantly different from the normal adult digit span of 7, t(24) = 2.73,,p p <.05 (p <.02).) Amphetamine significantly enhances recent memory.

The Seven-Step Procedure An easy, organized way to test hypotheses and come to a conclusion. See T&G, pp. 190-191. mhhe.com/thorne4 Chapter 9 Review Gunsmoke: Errors in Hypothesis Testing Reality Cheating Decision Not Cheating H 0 : BB Honest H A : BB Cheating Error I: Shot Innocent Man Consequences? Correct Correct Error II: Continue to Play Consequences?

Errors in Hypothesis Testing Reality Decision Reject Fail to Reject (Claim an Effect) (No Claim) H 0 :True H 0 : False Type I Error Alpha Error (false claim) Consequences? Correct Correct Type II Error Beta Error (failure of detection) Consequences? Errors in Hypothesis Testing Signal Detection Theory Version Reality H 0 : True (No Sign.) H 0 : False (Yes Signal present) Is Signal Present? Decision Yes (Reject) (Claim an Sig. Present) Type I, Alpha Error Alpha Error (false positive false alarm Consequences? Correct Hit No Signal Fail to Reject (No Claim) Correct (Correct Rejection) Type II, β Error (False negative Miss ) (failure of detection) Consequences?

Consequences of Errors False Positive (false claim, Type I error) Gunsmoke: If you shoot an innocent man, you ll hang Death penalty trials avoid conviction c o of innocents. Evidence must by very very good. Hunters in the woods must not shoot everything that moves in the bushes. It could be a person or dog. Threshold for pulling the trigger must be raised until the signal is clearly present. In these situations, really want to avoid false claims. False negatives (failures of detection) not that bad Gunsmoke: Continue to play & loose $. Some criminals get lesser sentence or go free The hunter misses their game once in a while. False positives less harmful than False negatives Blood bank screening for AIDS (HIV) virus Test very sensitive; tends to detect HIV sometimes when it is not there (false positives; Type I errors) )but t always detect twhen it is there. Thus we eliminate failures of detection ( Misses ) ). Shoot and ask questions later. Cost of false positives some clean blood samples are wasted; a small price to pay for knowing gyou have no infected blood in YOUR transfusion!

The Almighty d prime Abiasfree bias-free measure of test sensitivity (under construction) A bias for claiming an effect ( shoot anything that moves ) will result in more false positives A bias for holding back on claiming a effect ( be sure before you shoot ) will result in more false negatives (Misses). d is the hit rate H the false alarm rate F when H and F are z-transformed: d = z(h) z(f), where H = P("yes" YES) and F = P("yes" NO) Colin Wilson has provided his Excel formula: d' = NORMINV(hitrate,0,1) - NORMINV(false-alarm-rate,0,1) where Excel's NORMINV "Returns the inverse of the normal cumulative distribution for the specified mean and standard deviation", 0 being the specified mean and 1 being the specified SD. See this link: http://www.linguistics.ucla.edu/faciliti/facilities/statistics/dprime.htm i i l /f ili i/f ili i / i i /d i h http://wise.cgu.edu/sdtmod/signal_applet.asp Power-Definitions Beta = Probability of a Type II error Probability of failure of detection Power = 1 - Beta Power = 1 - Probability of a failure of detection Power = Probability of detection of an effect

Why Power How hard would you work if the chance of getting paid =.20,.50,.80? In behavioral science research we want a good chance of ffinding an effect, if it is present (e.g., a cure for the common cold or a form of cancer, confirming YOUR thesis/dissertation hypothesis). Statistical power tells us our chance of finding an effect the Probability of detection of an effect Factors Affecting Power Alpha level Increasing the alpha level the level of power. Sample Size Increasing the sample size the level of power. Size of the effect (mu0 - mu1) The larger the effect, the the level of power.

Power Diagrams Text illustrations - Howell, 4.7 Note best shown in two curves: one curve for H 0 shows α and 1- α; when H0 is true Another curve for H 1 true shows β and 1- β See Howell s Figure 4.3 Spreadsheet illustration Power.xls What Should We Be Looking For? What statistical significance tells us What effect size tells us Practical significance Statistical ti ti significance ifi = statistical ti ti reliability How sure we are of the presence of some effect Practical significance ifi = effect size = importance! More Later.

Using SPSS One sample t test Confidence Interval Example from Ratings of Speech (enter 4.40 x 33 times, 0.51, 8.29. This simulates data for example.) Enter data; Use this syntax (or point-click) T-TEST /TESTVAL = 4 /VARIABLES = Ratings. * Comment: Set Testval = 0 to get CI on the mean. T-TEST /TESTVAL = 0 /VARIABLES = Ratings. One-Sample Statistics Ratings N Mean Std. Deviation Std. Error Mean 35 4.4000.94395.15956 One-Sample Test Ratings Test Value = 4 95% Confidence Interval of the Difference Mean t df Sig. (2-tailed) Difference Lower Upper 2.507 34.017.40000.0757.7243 * Comment: Set Testval = 0togetCIonthemean. on the T-TEST /TESTVAL = 0 /VARIABLES = Ratings. One-Sample Test Ratings Test Value = 0 95% Confidence Interval of the Difference Mean t df Sig. (2-tailed) (2tild) Difference Lower Upper 27.576 34.000 4.40000 4.0757 4.7243

Lab Assignment 1 Review questions and formulae Formulae for #2. Help on setup for #5. SPSS startup