
Stat 1001, Winter 1998, Geyer
Homework 7

Problem 18.1
20 and 25.

Problem 18.2
(a) Average of the box. $(1 + 3 + 5 + 7)/4 = 4$.

SD of the box. The deviations from the average are $-3, -1, 1, 3$. The squared deviations are 9, 1, 1, 9. The sum of the squared deviations is 20. The average squared deviation is $20/4 = 5$ and the SD is $\sqrt{5} = 2.236$.

Expected value for the sum of draws. $400 \times 4 = 1{,}600$.

SE for the sum of draws. $\sqrt{400} \times 2.236 = 44.72$.

Conversion to standard units. 1,500 is $100/44.72 = 2.236$ SE below the expected value. That is, $-2.236$ in standard units.

Table look-up. The area we want to look up is the area above 1,500 in original units ($-2.236$ in standard units). The normal curve tail area table says the tail (the unshaded area) is 1.3% (between 1.39 and 1.22). The shaded area is $100 - 1.27 = 98.7$ percent.

[Figure: normal curve with the area above 1,500 shaded. Horizontal axes: value of the sum (1500, 1600, 1700) and standard units (-3 to 3).]
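The steps of part (a) can be retraced with a short Python sketch. It is only a check on the arithmetic, not part of the original solution: the helper name `box_average_and_sd` is made up for illustration, and the standard library's `NormalDist` stands in for the normal curve table in the book.

```python
from math import sqrt
from statistics import NormalDist


def box_average_and_sd(tickets):
    """Average and SD of a box (the SD divides by the number of tickets)."""
    average = sum(tickets) / len(tickets)
    mean_square_dev = sum((t - average) ** 2 for t in tickets) / len(tickets)
    return average, sqrt(mean_square_dev)


box = [1, 3, 5, 7]   # the box from Problem 18.2(a)
n_draws = 400

average, sd = box_average_and_sd(box)
ev_sum = n_draws * average        # expected value for the sum of draws
se_sum = sqrt(n_draws) * sd       # SE for the sum of draws

z = (1500 - ev_sum) / se_sum              # 1,500 converted to standard units
chance_above = 1 - NormalDist().cdf(z)    # normal-curve area above z

print(f"EV = {ev_sum:.0f}, SE = {se_sum:.2f}, z = {z:.3f}")
print(f"chance the sum is above 1,500: {100 * chance_above:.1f}%")
```

The computed area should agree with the table look-up to the precision the table allows.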

(b) For this problem we have to recode the box to count threes. Thus we use the box 0 1 0 0.

Average of the box. $1/4$.

SD of the box. $(1 - 0) \times \sqrt{\tfrac{1}{4} \times \tfrac{3}{4}} = 0.4330$.

Expected value for the sum of draws. $400 \times \tfrac{1}{4} = 100$.

SE for the sum of draws. $\sqrt{400} \times 0.4330 = 8.660$.

Conversion to standard units. 90 is $10/8.660 = 1.155$ SE below the expected value. That is, $-1.155$ in standard units.

Table look-up. The area we want to look up is the area below 90 in original units ($-1.155$ in standard units). The normal curve tail area table says the tail area is 12.4% (between 12.51 and 11.51, much closer to the former).

[Figure: normal curve with the tail below 90 marked. Horizontal axes: value of the sum (80, 100, 120) and standard units (-3 to 3).]
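The recoding step in part (b) can be sketched the same way. The sketch below builds the counting box from the original box 1 3 5 7, checks the shortcut SD formula against a direct calculation, and then repeats the normal approximation without continuity correction. The variable names are illustrative assumptions, and `NormalDist` again replaces the printed table.

```python
from math import sqrt
from statistics import NormalDist, pstdev

original_box = [1, 3, 5, 7]
# Recode: a ticket becomes 1 if it is a three, 0 otherwise.
counting_box = [1 if ticket == 3 else 0 for ticket in original_box]

fraction_ones = sum(counting_box) / len(counting_box)
sd_shortcut = (1 - 0) * sqrt(fraction_ones * (1 - fraction_ones))
sd_direct = pstdev(counting_box)   # population SD computed from the tickets

n_draws = 400
ev_count = n_draws * fraction_ones        # expected number of threes
se_count = sqrt(n_draws) * sd_shortcut    # SE for the number of threes

z = (90 - ev_count) / se_count            # 90 converted to standard units
chance_below = NormalDist().cdf(z)        # normal-curve area below z

print(f"counting box = {counting_box}, SD = {sd_shortcut:.4f}")
print(f"EV = {ev_count:.0f}, SE = {se_count:.3f}, z = {z:.3f}")
print(f"chance of fewer than 90 threes (no correction): {100 * chance_below:.1f}%")
```

The agreement between `sd_shortcut` and `sd_direct` is the reason the shortcut can be used for any box containing only 0's and 1's.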

Continuity correction. To be finicky, we should use "continuity correction." The plot we should actually be thinking of is the one below.

[Figure: probability histogram for the sum of draws with the rectangles for numbers below 90 shaded. Horizontal axes: value of the sum (70 to 100) and standard units (-3.5 to 0).]

"Fewer than 90" means numbers below 90, not including 90. The actual probability histogram for the sum of draws is shown. We want to calculate the area of the shaded rectangles, corresponding to numbers below 90. This is approximated by the area under the normal curve below 89.5 in original units, which is $(89.5 - 100)/8.660 = -1.212$ in standard units. The normal curve tail area table says the tail area is 11.3% (between 11.51 and 10.56, much closer to the former).

Exact calculation (we don't know how to do this having skipped Chapter 15, but suffice it to say that I asked a computer) shows that the exact answer is 11.2%. So the normal approximation with continuity correction is correct to almost three figures. Without continuity correction we don't even get two correct figures (12.4 rounds to 12 whereas 11.2 rounds to 11). Nevertheless, we count 12.4% as a correct answer; you don't have to use continuity correction on this problem.

Problem 18.4
This problem is like Example 1(a) in Section 4. If you don't use "continuity correction" you can't do the problem at all.

The question is about the sum of 25 draws from the box 0 1. What is the chance that the sum of draws is exactly 12? (If you are confused by the phrasing of the question "12 heads and 13 tails," you have to realize that this is just the same as saying "12 heads," because if you get 12 heads in 25 tosses, then the other 13 tosses must be tails.)

We don't need to calculate the average of the box and the SD of the box. Both are given in Example 5 of Chapter 17 (p. 301). Both are $1/2$.

Expected value for the sum of draws. $25 \times \tfrac{1}{2} = 12.5$.

SE for the sum of draws. $\sqrt{25} \times \tfrac{1}{2} = 2.5$.

The picture. The sides of the shaded rectangle are 11.5 and 12.5 in the original units.

[Figure: probability histogram with the rectangle over 12 shaded. Horizontal axes: value of the sum (5 to 20) and standard units (-3 to 3).]

Conversion to standard units. 11.5 converted to standard units is $(11.5 - 12.5)/2.5 = -0.4$. 12.5 converted to standard units is 0.

Table look-up. Thus we want the area under the normal curve between $z = -0.4$ and $z = 0$. This is half the area between $-0.4$ and $+0.4$. That area, from the normal curve table in the book, is 31.08%. Thus the answer is 15.54%.

The Moral of the Story. Sometimes you must use "continuity correction." What would you do otherwise? If you say both sides of the rectangle are 12, the width of the rectangle is zero and the area zero. That's completely wrong. If you say one side of the rectangle is 12 and the other 13 so the rectangle has the right width, the answer you get is 15.85%, which has only two correct figures instead of the three you get using the continuity correction properly. (The exact answer, from asking a computer, is 15.50%.)

Center the Rectangles over the Numbers. In a classifying and counting problem the sum of draws from the box is a count (integer). In drawing the probability histogram, center the rectangles over the integers. The picture for this problem is an example.

Continuity Correction. When you have centered the rectangles in the probability histogram over the integers, the side of a rectangle is never an integer; it is always halfway between two integers (something point five).
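For readers who want to "ask a computer" themselves, the following sketch compares the continuity-corrected normal approximation with the exact binomial chance for Problem 18.4 and for the finicky version of Problem 18.2(b). The helper names `binomial_chance` and `normal_area` are assumptions made for this illustration; the exact chances come from the standard binomial formula, which these solutions do not otherwise cover.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_chance(k, n, p):
    """Exact chance of exactly k ones in n draws from a 0-1 box with fraction p of ones."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_area(low, high, ev, se):
    """Area under the normal curve between low and high (original units)."""
    curve = NormalDist()
    return curve.cdf((high - ev) / se) - curve.cdf((low - ev) / se)


# Problem 18.4: exactly 12 heads in 25 tosses.
n, p = 25, 0.5
ev, se = n * p, sqrt(n) * sqrt(p * (1 - p))
exact_12 = binomial_chance(12, n, p)
approx_12 = normal_area(11.5, 12.5, ev, se)   # rectangle centered over 12

# Problem 18.2(b): fewer than 90 threes in 400 draws.
n2, p2 = 400, 0.25
ev2, se2 = n2 * p2, sqrt(n2) * sqrt(p2 * (1 - p2))
exact_below_90 = sum(binomial_chance(k, n2, p2) for k in range(90))
approx_below_90 = NormalDist().cdf((89.5 - ev2) / se2)   # edge at 89.5

print(f"18.4:    exact {100 * exact_12:.2f}%, normal {100 * approx_12:.2f}%")
print(f"18.2(b): exact {100 * exact_below_90:.1f}%, normal {100 * approx_below_90:.1f}%")
```

In both cases the edges passed to the normal curve end in .5, which is the continuity correction at work.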

This problem and the finicky solution to Problem 18.2 are examples. The two boxes tell you how to use the normal approximation for probability histograms the Right Way (with a capital R and a capital W), using what the book calls "keeping track of the edges of the rectangles" and everyone else calls "continuity correction" (never mind why, it's a misleading name, but is traditional).

Problem 18.6
We should be comparing the error to the SE for the sum of draws. As in Problem 18.4, the relevant box is 0 1. We are interested in the sum of one million draws from this box. Also as in Problem 18.4, we don't need to calculate the average of the box and the SD of the box. Both are given in Example 5 of Chapter 17 (p. 301). Both are $1/2$. Thus the SE for the sum of draws is $\sqrt{1{,}000{,}000} \times \tfrac{1}{2} = 1{,}000 \times \tfrac{1}{2} = 500$.

The error should be about 500 or so. 95% of the time it will be less than 2 SE, which is 1,000. An error of 2,015 is more than 4 SE. The normal curve tail area table says this happens less than one time in 10,000. Looks like the computer program is buggy.

Problem 18.7
Like box (ii). The roll of each die is like one draw from box (ii). The dice are independent. Hence the total number of spots on the dice is like the sum of two draws with replacement from box (ii).

Problem 18.12
(a) No. You aren't given enough information to calculate the average and SD of the box.
(b) Yes. Knowing the average and the SD of the box and the number of draws allows you to calculate the expected value and SE for the sum of draws, and that's enough to use the normal approximation.

Problem 18.13
(a) No. There could be anywhere from 0 to 4 3's in the box. We would need to be told how many.
(b) No. The average and SD of the original box are no help. We need to know the average and SD of the box that has 3's replaced by 1's and all the other numbers replaced by 0's (as in any "classifying and counting problem"). We haven't been told that.
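Problems 18.6 and 18.12(b) both rest on the same point: the average of the box, the SD of the box, and the number of draws are all the normal approximation needs. A minimal sketch, with a hypothetical helper `ev_and_se_for_sum` and `NormalDist` standing in for the table, applies this to the numbers of Problem 18.6. The two-sided tail is an assumption about how "an error this large" should be counted.

```python
from math import sqrt
from statistics import NormalDist


def ev_and_se_for_sum(box_average, box_sd, n_draws):
    """Expected value and SE for the sum of draws, from three numbers only."""
    return n_draws * box_average, sqrt(n_draws) * box_sd


# Problem 18.6: one million draws from the box 0 1 (average 1/2, SD 1/2).
ev, se = ev_and_se_for_sum(0.5, 0.5, 1_000_000)

error = 2015
error_in_se = error / se   # size of the reported error in SE units

# Chance of an error at least this large in either direction.
chance = 2 * (1 - NormalDist().cdf(error_in_se))

print(f"EV = {ev:.0f}, SE = {se:.0f}, error = {error_in_se:.2f} SE")
print(f"chance of an error this large: about 1 in {1 / chance:,.0f}")
```

Nothing about the individual tickets enters the calculation, which is why parts (a) of Problems 18.12 and 18.13 fail for lack of the box average and SD rather than for lack of the tickets themselves.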

Problem 18.14
(a) Yes. We need to know the average and SD of the box that has positive numbers replaced by 1's and negative numbers replaced by 0's (as in any "classifying and counting problem"). We know there are four positive numbers and six negative ones. So the relevant box is 1 1 1 1 0 0 0 0 0 0. We can calculate the average (0.4) and SD ($\sqrt{0.4 \times 0.6} = 0.49$) of this box, and so forth.
(b) This is now moot. We don't need any extra information; we already had enough.

Problem 19.2
Yes or no, it depends on what "similar" means. I would expect some response bias in the first survey. If you feed people an answer, some of them will take it, regardless of whether it is correct. See p. 344. The mere order in which choices are given affects the choice. Asking to see what detergent is being used leaves no opportunity for response bias.

If you objected to the word "housewives" in the question, you missed the point. The authors of the textbook didn't phrase it well, but surely a real marketing survey would ask whoever does the laundry.

Problem 19.4
There is no random sampling here. People chose the blocks. They thought those blocks were "most representative" but people aren't any good at such decisions. That's why random samples are used. This can be called a quota sample, but quota sampling isn't a good method. See the box on p. 339 in the textbook.

Problem 19.5
This is a totally bogus way of dealing with nonresponse bias. The 347 households in the new sample are just like the responders in the first sample. Nothing has been found out about nonresponders.

There's no way to tell for sure which way the nonresponse bias goes. A guess is that the smaller households are more likely to be nonresponders (with fewer people, it's less likely that one is interested in responding). This would make the estimate too high.

What the finance department should have done is to go back to the 347 nonresponding households in the original survey and see if they could get responses from them on the second try (or third try, etc.).

Problem 20.1

Number of      Number of heads       Percentage of heads
tosses         EV         SE         EV        SE
100            50         5          50%       5%
2,500          1,250      25         50%       1%
10,000         5,000      50         50%       0.5%
1,000,000      500,000    500        50%       0.05%

The expected value for the number of heads is the number of draws times 0.5 (the average of the box). The expected value for the percentage of heads is always 50% (the percentage in the box, which is the same thing as the population percentage). The standard errors are given by

$$\text{SE for number} = \sqrt{\text{number of draws}} \times (\text{SD of box}) \qquad (1)$$

$$\text{SE for percentage} = \frac{\text{SD of box}}{\sqrt{\text{number of draws}}} \times 100\% \qquad (2)$$

(The SD of the box is also 0.5.)

The Moral of the Story. Repeating what is said at the bottom of p. 360: The SE for the sample number goes up like the square root of the sample size. The SE for the sample percentage goes down like the square root of the sample size.

Problem 20.3
(a) 50,000. The box models the population, not the sample.
(b) A zero or a one. This is a "counting and classifying" problem. We are counting gross incomes over $50,000.
(c) False. It doesn't even have units of dollars.
(d) True. The number of draws in the box model corresponds to the sample size in the real sample.
(e) The box model. The box model has 20% 1's and 80% 0's.

Expected value for the percentage. The expected value for the percentage is always the same as the population percentage, which is 20%, the percentage of 1's in the box.

SD for the box.

$$(\text{bigger number} - \text{smaller number}) \times \sqrt{(\text{fraction with bigger number}) \times (\text{fraction with smaller number})} = (1 - 0) \times \sqrt{0.20 \times 0.80} = 0.4$$

Caution. Don't use 20% and 80% instead of 0.20 and 0.80 here. If you do, your answer will be off by a factor of 100.

SE for the percentage. Now we use equation (2). The SE for the percentage is $(0.4/\sqrt{900}) \times 100\% = 1.333\%$.

The picture.

[Figure: normal curve. Horizontal axes: value of the percentage (16 to 24) and standard units (-3 to 3).]

Conversion to standard units. 19% is $(19 - 20)/1.333 = -0.75$ in standard units. 21% is $+0.75$.

Table look-up. The normal curve table in the textbook says this area is 54.67%.

(f) No. We aren't told anything about the fraction of incomes in the population over $75,000. We would need to know that.

Problem 20.4
This problem is tricky. Despite being in the chapter which introduces "percentage of draws," this problem is about "sum of draws." The "total gross income of the audited forms" is the sum of all the incomes in the audited sample. So in this problem we use formulas for sums, not percentages.
(a) 50,000. For the same reason as in Problem 20.3.

(b) Now that we are interested in the sum of incomes, the box with incomes written on the tickets is the appropriate box.
(c) True.
(d) True. For the same reason as in Problem 20.3.
(e) The box model. The box model has average $37,000 and SD $20,000, given in the statement of Problem 20.3.

Expected value for the sum of draws. $900 \times \$37{,}000 = \$33{,}300{,}000$.

SE for the sum of draws. $\sqrt{900} \times \$20{,}000 = \$600{,}000$.

The picture.

[Figure: normal curve. Horizontal axes: value of the sum in millions of dollars (31 to 35) and standard units (-3 to 3).]

Conversion to standard units. 33 million is $(33 - 33.3)/0.6 = -0.5$ in standard units.

Table look-up. The normal curve tail area table says the tail (unshaded area) is 30.85%. The opposites rule gives $100 - 30.85 = 69.15\%$ for the answer.

Problem 20.6
(ii) The sample size for California is $0.001 \times 30{,}000{,}000 = 30{,}000$. The sample size for Nevada is $0.001 \times 1{,}000{,}000 = 1{,}000$. The larger sample size for California provides more accuracy.
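The answer to Problem 20.6 follows from the SE formula for a percentage: the SE shrinks like the square root of the sample size. The sketch below makes the comparison concrete. The box SD of 0.5 is purely an assumption for illustration (the problem as quoted here gives no percentage), and `se_for_percentage` is a hypothetical helper name.

```python
from math import sqrt


def se_for_percentage(box_sd, n_draws):
    """SE for the sample percentage, in percentage points."""
    return box_sd / sqrt(n_draws) * 100


ASSUMED_BOX_SD = 0.5   # assumption: not given in the problem

se_california = se_for_percentage(ASSUMED_BOX_SD, 30_000)
se_nevada = se_for_percentage(ASSUMED_BOX_SD, 1_000)

print(f"California: SE = {se_california:.2f} percentage points")
print(f"Nevada:     SE = {se_nevada:.2f} percentage points")
print(f"ratio Nevada / California = {se_nevada / se_california:.2f}")

# The same formula with the numbers of Problem 20.3(e).
print(f"Problem 20.3(e): SE = {se_for_percentage(0.4, 900):.3f}%")
```

Whatever box SD is assumed, the ratio of the two SEs depends only on the sample sizes, which is why the larger California sample gives the more accurate estimate.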

Problem 20.11
(a) ... observed value is 357 but the expected value is 340.

$$\text{expected value for number} = (\text{number of draws}) \times (\text{average of box}) = (\text{sample size}) \times (\text{fraction in population})$$

The fraction of undergraduates in the population is $17{,}000/25{,}000 = 17/25$. Thus the expected value for the observed number is $500 \times 17/25 = 340$.

(b) ... observed value is 71.4% but the expected value is 68%. The expected value for the percentage is the same as the population percentage, $17/25 \times 100\%$.