Confidence Intervals

1) What are confidence intervals?

Simply, an interval for which we have a certain confidence. For example, we are 90% certain that an interval contains the true value of something we're interested in. A more specific example: we are 90% certain that the interval 66 to 75 inches contains the true mean (μ) for height for men (numbers made up). Note that we are looking at the true mean, μ, not ȳ.

Usually we want confidence intervals for means, but we can also construct them for variances, medians, and other quantities. Some of these can be quite useful.

The problem for us is that we need to find the endpoints of our interval (a, b). This is what this section of notes is all about: how do we find (a, b)?

Let's use a neat example from the text as an analogy: an invisible man walking a dog. We might not really be interested in the dog (well, some people might be), but rather in the man. We can see the dog - this is our ȳ, or sample mean. We can't see the man - this is our population mean (μ). But we know that the dog spends most of his time near the man - in fact, the further away from the man the dog gets, the less time he spends there.

2. Some more properties of the normal curve.

If we draw a curve for the amount of time the dog spends away from the man, it'll be a normal curve, peaking where the man is. We use the dog to estimate where our man is, and we use ȳ to estimate where μ is. If we use these two facts, we can construct an interval for μ. (We use the normal curve to try and figure out where μ actually is - actually, we'll wind up using the t-distribution, but more on that soon.)

We already discussed reverse lookup. In this case we're interested in reasonable numbers like 90%, 95%, etc.

Since we're interested in numbers like 90%, 95%, 99%, etc., we need to look these up in our normal tables. For instance:

Pr{Z < z} = .90  =>  z = 1.28
Pr{Z < z} = .95  =>  z = 1.65
Pr{Z < z} = .99  =>  z = 2.33

(Remember, the symbol => means "implies".)

But this time we're interested in an interval (for instance, the middle 90%), so we really want:

Pr{z₁ < Z < z₂} = .90  =>  z₁ = -1.65 and z₂ = 1.65

Note: we look up the value for Pr{Z < z} = .95, and then take advantage of symmetry. If, for example, we want 90% of the area in the middle of our curve, that means each end of our curve has 5% that we don't want. So we look up the value for 95%, and then just add a negative sign for the lower part of our curve.
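If you have software handy, you can do the same reverse lookup without a table. Here is a minimal sketch (my own illustration, not part of the notes), assuming Python with scipy is available; norm.ppf is the inverse of the standard normal CDF.

    from scipy.stats import norm

    # One-sided lookups: Pr{Z < z} = .90, .95, .99
    print(norm.ppf(0.90))   # ~1.28
    print(norm.ppf(0.95))   # ~1.65 (1.645)
    print(norm.ppf(0.99))   # ~2.33

    # Middle 90%: 5% in each tail, so look up 0.95 and use symmetry.
    z2 = norm.ppf(0.95)     # ~1.65
    z1 = -z2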

Similarly, we get:

Pr{z₁ < Z < z₂} = .95  =>  z₁ = -1.96 and z₂ = 1.96
Pr{z₁ < Z < z₂} = .99  =>  z₁ = -2.58 and z₂ = 2.58

(Notice that the values 1.65, 1.96, and 2.58 will come up a lot!)

3. Constructing confidence intervals.

So where are we? Well, suppose we took a sample and calculated ȳ. What we want to figure out is the probability that a certain interval around ȳ includes μ. Or, putting it another way, we'd like something like:

Pr{y₁ < Ȳ < y₂} = .90, where y₁ and y₂ are such that they would have a 90% chance of including the true value for μ.

But we know how to do this! (It's just not obvious yet.) Here's how to do it for Z (following the example on p. 190 [186] {177}):

Pr{-1.65 < Z < 1.65} = 0.90

We just did this above. We also know that the following has a standard normal distribution:

Z = (Ȳ - μ) / (σ/√n)

So we can now substitute this for Z in the first equation and then isolate μ:

Pr{-1.65 < (Ȳ - μ)/(σ/√n) < 1.65} = 0.90
= Pr{-1.65·σ/√n < Ȳ - μ < 1.65·σ/√n} = 0.90          [multiplying by σ/√n]
= Pr{-Ȳ - 1.65·σ/√n < -μ < -Ȳ + 1.65·σ/√n} = 0.90     [subtracting Ȳ]
= Pr{Ȳ - 1.65·σ/√n < μ < Ȳ + 1.65·σ/√n} = 0.90        [see comment below]

(Comment: in the last step we multiplied by -1 and reversed the inequalities; then we re-arranged things within the probability symbol, so it looks a little like we never reversed the inequalities.)
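As a quick numeric check of the final line, here is a sketch using made-up numbers (ȳ = 70 inches, σ = 3, and n = 25 are my own assumptions, not values from the notes):

    from math import sqrt

    ybar, sigma, n = 70.0, 3.0, 25    # hypothetical values, just for illustration
    z = 1.65                          # middle 90% of the standard normal
    half_width = z * sigma / sqrt(n)
    print(ybar - half_width, ybar + half_width)   # 90% interval for mu: (69.01, 70.99)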

And we finally get:

ȳ ± 1.65·σ/√n

which gives us the endpoints for an interval that will contain μ with 90% confidence.

Problem: we DO NOT know σ. Well, substitute s for σ. Unfortunately, this means that our Z is no longer normally distributed; in other words, we can't use our normal tables. Why? Here's a partial explanation. Z has a normal distribution and is equal to the following:

Z = (Ȳ - μ) / (σ/√n)

Everything in this equation (except Ȳ) is a constant. Ȳ has a normal distribution, so Z has a normal distribution. But if we substitute s for σ, we have:

Z = (Ȳ - μ) / (s/√n)

But s is NOT a constant; it's a random variable (like Ȳ), and s has its own distribution. And it's not normally distributed. This means Z does not have a normal distribution. Or, in other words:

Z ≠ (Ȳ - μ) / (s/√n)

So was all the math above useless? No, or we wouldn't have done it.

Introducing the t-distribution: without delving deep into the theory, the above quantity has a t-distribution:

t = (Ȳ - μ) / (s/√n)

The t-distribution looks a lot like the normal, but generally has fatter tails.
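To see that replacing σ with s really does change the distribution, here is a small simulation sketch (my own illustration, not from the notes); with n = 5, well over 5% of the standardized values fall outside ±1.96, which is exactly the fatter-tails behavior just described.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 0.0, 1.0, 5, 100_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    ybar = samples.mean(axis=1)
    s = samples.std(axis=1, ddof=1)            # sample SD: a random variable, not a constant
    t_stat = (ybar - mu) / (s / np.sqrt(n))    # uses s in place of sigma

    # For a true standard normal, about 5% falls outside +/-1.96;
    # here it's roughly 12%, because the t-distribution has fatter tails.
    print(np.mean(np.abs(t_stat) > 1.96))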

[Incidentally, notice that as ν → ∞, the t-distribution → the normal distribution.]

The t-distribution has only one parameter, the Greek nu, or ν (NOT μ). Nu (ν) represents the degrees of freedom.

Historical note: the t-distribution was first derived by W.S. Gosset while working for the Guinness brewery. Guinness did not allow him to use his own name in publishing, so he published under the pseudonym "Student". The Guinness brewing company now has a plaque near its entrance commemorating Gosset.

Because it has fatter tails, the t-distribution will require bigger numbers for the same probability. What does this mean?? Suppose you want the middle 90% of a normal distribution. The cut-offs (the z-values) are given above and are -1.65 and 1.65, or in terms of probability:

Pr{z₁ < Z < z₂} = .90  =>  z₁ = -1.65 and z₂ = 1.65

But now if we use a t-distribution (assuming ν = 10), and we want the middle 90%, we have:

Pr{t₁ < T < t₂} = .90  =>  t₁ = -1.812 and t₂ = 1.812
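The same reverse lookup works for the t-distribution; a quick sketch (assuming scipy, as before):

    from scipy.stats import norm, t

    # Middle 90% means 5% in each tail, so look up the 0.95 quantile.
    print(norm.ppf(0.95))       # ~1.65
    print(t.ppf(0.95, df=10))   # ~1.812: bigger, because of the fatter tails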

Notice that to get the same probability (.90), we need to use bigger numbers (in terms of absolute value) for t. This also means that our confidence interval will be bigger using the t-distribution than using the normal distribution. But remember, we have to use the t-distribution (at least unless n is really large). Let's give some examples (see figure 6.7 on p. 191 [6.7, p. 187] {6.3.2, p. 178}. [illustrate]).

See table 4 in your book (or the inside back cover). This is arranged backwards from your normal tables. In other words, the numbers for the areas in the main part of the normal table are now along the top, and the numbers from the margins of the normal table are now inside the table. Why? Because with the t-distribution we're hardly ever interested in anything except the values for 90%, 95%, 99%, etc.

Also, note that the values are for the upper tail, NOT for the area below (opposite to how your normal table is tabulated). So instead of 0.90, it shows 0.10; instead of 0.95, it shows 0.05, etc.

Finally, we should probably fix the t-tables to make them less confusing (though the reason we're doing this may not be obvious yet). To the bottom of your t-tables, add the following numbers (one for each column - note that each number is exactly double the number on the top):

.40  .20  .10  .08  .06  .05  .04  .02  .01  .001

Then add "TWO-TAILED PROBABILITY" just below these numbers.

CAUTION: some of the description in the text doesn't match the table, so be careful.

4) Finally, let's actually calculate some confidence intervals.

Here's how to calculate, for example, a 95% CI:

ȳ - t_{95%,ν}·SE_ȳ  and  ȳ + t_{95%,ν}·SE_ȳ

or

ȳ ± t_{95%,ν}·SE_ȳ,  which is the same as  ȳ ± t_{95%,ν}·(s/√n)

where the critical t-value is from table 4, with ν = n-1 = degrees of freedom.
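The formula translates directly into a few lines of code. This is a sketch of my own (the function name is made up, not from the notes), assuming scipy for the critical t-value:

    from math import sqrt
    from scipy.stats import t

    def t_confidence_interval(ybar, s, n, conf=0.95):
        """CI for mu: ybar +/- t * s/sqrt(n), with nu = n - 1 degrees of freedom."""
        nu = n - 1
        crit = t.ppf(1 - (1 - conf) / 2, df=nu)   # e.g. the 0.975 quantile for a 95% CI
        half_width = crit * s / sqrt(n)
        return ybar - half_width, ybar + half_width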

Concerning subscripts: t above has a subscript of 95%. This indicates that we want the value of t which puts 2.5% of the area into each tail. t requires another subscript indicating the degrees of freedom. Remember, there are an infinite number of t-distributions, so knowing the degrees of freedom is important.

Your text does things a little differently here. The text always uses the one-sided probability of the upper tail for the subscript, which might be a bit confusing. But this is generally a much better way to do things. Still, for now it's easier to stick with % in the subscript. (Your book would use t_{.025,ν} instead of t_{95%,ν}.) (To do it the book's way, consider that the area outside of 95% is equal to 5%. Then divide 5% in half to get 2.5% for each tail. Finally, remember that 2.5% is the same as a probability of 0.025.)

So here is a specific example, exercise 6.9, p. 198 [6.10, p. 194] {6.3.3, p. 185}. We are given that the sample mean is 31.7, the sample standard deviation is 8.7, and the sample size (n) is 5. Part (b) from the exercise asks us to construct a 90% confidence interval:

ȳ ± t_{90%,4}·(s/√n) = 31.7 ± t_{90%,4}·(8.7/√5) = 31.7 ± 2.132·3.891 = 31.7 ± 8.295

which gives us: (23.4, 40.0)

Two comments:

1) Always finish the calculation. 31.7 ± 8.295 is NOT an interval.

2) We're calculating actual mathematical intervals. Always put the smaller number first. For example, (40.0, 23.4) is NOT an interval.

We can also calculate a 95% confidence interval: we follow the same procedure as in (b), but now our critical value for t becomes 2.776. In other words, t_{95%,4} = 2.776, and we get a confidence interval of: (20.9, 42.5)
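A quick check of both intervals with software (a sketch, assuming scipy; the numbers are the ones from the exercise above):

    from math import sqrt
    from scipy.stats import t

    ybar, s, n = 31.7, 8.7, 5
    se = s / sqrt(n)                            # ~3.891

    t90 = t.ppf(0.95, df=n - 1)                 # 2.132 (5% in each tail)
    print(ybar - t90 * se, ybar + t90 * se)     # about (23.4, 40.0)

    t95 = t.ppf(0.975, df=n - 1)                # 2.776
    print(ybar - t95 * se, ybar + t95 * se)     # about (20.9, 42.5)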

See also examples 6.6 and 6.7 [6.6 & 6.7] {6.3.1 and 6.3.2 (different in 4th edition, but the same type of examples)} in your text. For now, don't pay attention to the graphs - we'll get to that stuff soon.

5. Comments on confidence intervals.

The confidence interval gets bigger if we want to be more certain. Where would the ends be if we wanted to be 100% certain??

Exactly what does the CI tell us? It tells us how confident we are that the limits of our CI contain the true value of μ.

Here's another way of looking at CIs: if we construct 100 95% CIs, then 95 of our CIs (on average) will contain the true value of μ. Similarly, if we construct 100 90% CIs, then 90 of them (on average) will contain the true value of μ. Your text has a nice picture that shows this on p. 195 [p. 191] {p. 182}.

What does the CI not tell us? It does NOT tell us that:

Pr{20.9 < μ < 42.5} = 0.95

This is absolutely WRONG. μ is a constant, and so, obviously, are the endpoints of our interval. The statement above is then either correct or incorrect, never anything in between. The probability is either 0 or 1. For instance, suppose we know that μ = 43.1. What happens to the above statement???

Remember: μ is a constant, not a random variable (Ȳ, on the other hand, is a random variable).

Let's do another example: 6.9, p. 196 [6.9, p. 192] {6.3.4, p. 183}: bone mineral density. 94 women took medication to prevent bone mineral loss (hormone replacement therapy (very controversial these days)). After the study, we find that ȳ = 0.878 gm/cm³ (the book erroneously uses gm/cm², which still hasn't been fixed in the 4th edition). We also find that s = 0.126 gm/cm³. So we get SE_ȳ = 0.013 gm/cm³.

Problem: how many degrees of freedom do we use? There is no row for d.f. = 93. We always use the row with fewer d.f. than we have; in this case we use 80 d.f. (otherwise our CI would be too small (too big is okay, too small is bad)). Your text is wrong here; don't round to the nearest number, always round down. So t_{95%,80} = 1.990.

Constructing a CI gives us 0.878 ± 1.990(0.013), and finally we get: (0.852, 0.904)

Again, what does this mean? We are 95% confident that the true average hip bone mineral density for women aged 45 to 64 taking this medication is between 0.852 gm/cm³ and 0.904 gm/cm³.

6. Some optional material

Suppose you want a confidence interval of a certain size. It is actually possible to do this: the bigger your sample size, the smaller your CI. Section 6.4 in your book (all editions) discusses how to determine sample size. It's not an ideal method, because you need to guess at the standard deviation (and sometimes you just don't have a clue). Here's the basic idea. Make a guess for the SD (= s), and remember that

SE = SD/√n

Then plug in this number (your guess) for the SD, use the SE that you want (which helps determine the CI), and solve for n:

n = (SD/SE)²

See example 6.11 [6.11] in the text. We won't go into the details, though we might get back to this later.
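As a final check, here is a sketch (assuming scipy, with a made-up target SE in the last part) that reproduces the bone density interval and illustrates the sample size idea:

    from math import sqrt
    from scipy.stats import t

    # Bone mineral density example: n = 94, ybar = 0.878, s = 0.126.
    n, ybar, s = 94, 0.878, 0.126
    se = s / sqrt(n)                                 # ~0.013

    # The printed table has no row for 93 d.f., so the notes round DOWN to 80 d.f.
    # (with software we could use 93 exactly; t.ppf(0.975, 93) is about 1.986).
    t_crit = 1.990                                   # table value for 80 d.f.
    print(ybar - t_crit * se, ybar + t_crit * se)    # about (0.852, 0.904)

    # Sample size idea: guess the SD, pick the SE you want, solve n = (SD/SE)^2.
    sd_guess, se_wanted = 0.126, 0.01                # the target SE is a made-up value
    print((sd_guess / se_wanted) ** 2)               # ~158.8, so round up to n = 159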