Introduction to Statistics

Alannah Bryant
5 years ago
Chapter 1
Introduction to Statistics

1.1 Preliminary Definitions

Definition 1.1. Data are observations (such as measurements, genders, survey responses) that have been collected.

Definition 1.2. Statistics is a collection of methods for planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on data.

Definition 1.3. A Population is the entire collection of individuals or measurements about which information is desired.

Definition 1.4. A Sample is a subset of the population that has been selected for study.

Definition 1.5. A statistic is a numerical description of a SAMPLE.

Definition 1.6. A parameter is a numerical description of a POPULATION.

Definition 1.7. Statistical Inference consists of methods or techniques for generalizing from a sample to the population from which the sample is selected.

Definition 1.8. Sampling Variability describes the extent to which samples differ from one another.

1.2 Framework of Statistics

[Figure: diagram relating Population, Sample, Parameter, and statistic]

Idea for a Confidence Interval

Idea for a Hypothesis Test

Chapter 2
Probability

Remark 2.1. The information regarding probability can be found in Chapter 4 of your textbook.

How do we measure likeliness? How do we determine what is considered (un)likely?

[Figure: scale marked from 0 to 1]

Definition 2.1. The Probability of an event is ______

Definition 2.2. A significance level α is the largest probability an unlikely event can have.

2.1 Definitions & Examples

Definition 2.3. The result of a single trial of a given procedure is called an outcome.

Definition 2.4. An event is any collection of results or outcomes of a procedure.

Definition 2.5. A simple event is an outcome or an event that cannot be further broken down into simpler components.

Definition 2.6. The sample space for a procedure consists of all possible simple events.

Example 2.1. A bucket contains some numbered balls. Eventually, one ball will be removed at random.
1. Find the sample space for this procedure.
2. Let A denote the event that the outcome is even. Describe A in terms of simple events.

Example 2.2. Two numbered balls are removed individually from a bucket, and each is replaced after it is removed. The numbers on the balls are written down.
1. Find the sample space for this procedure.
2. Let B denote the event that at least one ball is a three. Describe B in terms of simple events.
3. Is the event "at least one ball is a three" a simple event?
4. If the numbers on the balls were added together, what would be the sample space?

2.2 Some Methods for Computing Probabilities of an Event

There are three approaches to determining the probability of an event:
1. Subjective Probabilities
2. Relative Frequency Approximation
3. Classical Approach

Theorem 2.1. The Law of Large Numbers states that as a procedure is repeated again and again, the relative frequency approximation for the probability of an event tends to approach the actual probability.

Example. Men and women were surveyed. They were asked "Which do you like better: Pollen or Propolis?" The answers are tallied below.

          Pollen   Propolis   Total
Men
Women
Total

a. What is the probability that a randomly selected survey respondent will be a woman?
b. What is the probability that a randomly selected survey respondent will prefer pollen?
c. If you consider only the female responses, what is the probability that you would randomly select one of the women that prefer pollen?

Example 2.4. A colored ball is removed, at random, from a bucket. What is the probability that the ball will be green?

Example 2.5. Two fair four-sided dice are rolled. What is the probability that both numbers will be even?

Example. ___ fair four-sided dice are rolled. What is the probability that all the numbers will be even?
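Example 2.5 can be checked with the classical approach by listing the whole sample space and counting. The following is only a sketch in Python (the notes themselves do not use code, and the variable names are made up for illustration).

```python
from fractions import Fraction
from itertools import product

# Faces of one fair four-sided die.
faces = [1, 2, 3, 4]

# Sample space: every ordered pair (first die, second die), all equally likely.
sample_space = list(product(faces, repeat=2))

# Event: both numbers are even.
both_even = [roll for roll in sample_space if all(n % 2 == 0 for n in roll)]

# Classical approach: favorable outcomes divided by total outcomes.
p_both_even = Fraction(len(both_even), len(sample_space))
print(p_both_even)  # 1/4
```

Changing `repeat=2` to another number of dice handles the "all the numbers are even" version of the question.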

Counting

Fundamental Counting Rule
Given two sequential events, if the first can occur m ways, and the second event can occur n ways, then the number of ways both events can occur in sequence is equal to m · n.

Example 2.7. An airline has 6 routes from City A to City B, and 9 routes from City B to City C. If you were to use this airline, how many routes could you take from City A to City C?

Example 2.8. How many ways can a family with 6 members be lined up to take a family portrait?
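Both examples are direct applications of the Fundamental Counting Rule. A short Python sketch of the arithmetic (variable names are illustrative only):

```python
import math

# Example 2.7: 6 routes from A to B, then 9 routes from B to C.
routes_a_to_c = 6 * 9

# Example 2.8: 6 choices for the first spot, 5 for the next, and so on.
portrait_orders = math.factorial(6)

print(routes_a_to_c)    # 54
print(portrait_orders)  # 720
```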

Order or no order? Repeats or not?

Definition 2.7. Permutations of items are arrangements in which different sequences of the same items are counted separately.

Definition 2.8. Combinations of items are arrangements in which different sequences of the same items are not counted separately.

Selecting r of n distinct objects:

             Repeats    No Repeats
Unordered               nCr
Ordered      n^r        nPr
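The three filled-in cells of the table can be evaluated with Python's standard library. This is a sketch with illustrative values n = 7 and r = 4 (chosen here only as an example); `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math

n, r = 7, 4  # illustrative values: select 4 of 7 distinct objects

ordered_with_repeats = n ** r            # n^r
ordered_no_repeats = math.perm(n, r)     # nPr = n! / (n - r)!
unordered_no_repeats = math.comb(n, r)   # nCr = n! / (r! (n - r)!)

print(ordered_with_repeats, ordered_no_repeats, unordered_no_repeats)  # 2401 840 35
```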

Example 2.9. You have 4 extra tickets for a concert and 7 friends. How many different groups of your friends could accompany you to the concert?

Example. You have three astronauts, Anna, George, and Michele, on the first Mission to Mars. For the first Marswalk, two of them will be allowed to leave their flying saucer and walk on the planet; one will have to remain behind. How many different ways can they be assigned a job for their first landing? If they are randomly given their assignment, what is the probability that George will be left on the ship?

Example. How many five-letter words can be made with the letters F, S, H, E? A letter can be used more than once. What is the probability that a five-letter word will start with the letter F? Only the letters F, S, H, E can be used. Letters can be repeated.

When Some Items are Identical to Others - Another Permutation Rule

Example. How many different ways can the letters in TENNESSEE be arranged? If these letters are randomly arranged, what is the probability that they will spell TENNESSEE?
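For the TENNESSEE example, the usual rule for arrangements with identical items is n! divided by the factorial of each letter's count. The notes leave that rule to be filled in, so treat the following Python sketch as one standard way to carry out the count, not as the notes' own worked solution.

```python
import math
from collections import Counter
from fractions import Fraction

word = "TENNESSEE"
counts = Counter(word)  # T: 1, E: 4, N: 2, S: 2

# Arrangements of n items with repeated kinds: n! / (n1! * n2! * ...)
arrangements = math.factorial(len(word))
for c in counts.values():
    arrangements //= math.factorial(c)

# Exactly one of the equally likely arrangements spells the word.
p_spells_word = Fraction(1, arrangements)

print(arrangements)   # 3780
print(p_spells_word)  # 1/3780
```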

The Addition Rule for Probabilities

Definition 2.9. A compound event is any event combining two or more simple events.

Notation 2.1. More notation that will be used:
(A or B) = ______
(A and B) = ______

Formal Addition Rule
P(A or B) = P(A) + P(B) - P(A and B)

Example. Suppose the following: P(A) = .9, P(B) = .8, P(A and B) = .77. Find P(A or B).

Example. In a group of 101 students, 40 are juniors, 50 are female, and 22 are female juniors. Find the probability that a student picked from this group at random is either a junior or female.
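The two examples above plug straight into the Formal Addition Rule. A small Python sketch using exact fractions (the helper name `p_or` is made up for illustration):

```python
from fractions import Fraction

def p_or(p_a, p_b, p_a_and_b):
    """Formal addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# First example: P(A) = .9, P(B) = .8, P(A and B) = .77
first = p_or(Fraction(9, 10), Fraction(8, 10), Fraction(77, 100))

# Second example: 101 students, 40 juniors, 50 female, 22 female juniors
second = p_or(Fraction(40, 101), Fraction(50, 101), Fraction(22, 101))

print(first)   # 93/100
print(second)  # 68/101
```

Subtracting P(A and B) removes the students counted twice (the 22 female juniors).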

Example. A family of 6 is going to have their picture taken. The photographer is going to randomly line everyone up. What is the probability that the mother ends up in the first chair or the father ends up in the sixth chair?

Example. A single card is chosen at random from a standard deck of 52 playing cards. What is the probability of choosing a king or a club?

Example. Two dice are rolled. The first is a fair 6-sided die. The second is a fair 4-sided die. Once they are rolled, the two numbers on the two dice are used to create a 2-digit number. The number from the 6-sided die is used to make the tens digit. The number from the 4-sided die is used to make the ones digit. What is the probability that the resulting number is odd or begins with an even number?

Definition. Events A and B are disjoint (or mutually exclusive) if they cannot occur at the same time. (That is, they do not overlap.)

Probability of the Intersection of Two Disjoint Events
If events A and B are disjoint, then P(A and B) = ______

Addition Rule for DISJOINT Events
If events A and B are disjoint, then P(A or B) = P(A) + P(B) - P(A and B) = ______

Example. Suppose that A and B are disjoint events such that the following is true: P(A) = .9, P(B) = .06. Find P(A or B).

Example. In a group of 201 students, 70 are freshmen, 41 are sophomores, 30 are juniors, 50 are seniors, and 10 are graduate students. Find the probability that a student picked from this group at random is either a freshman or a sophomore.

Example. A family of 6 is going to have their picture taken. The photographer is going to randomly line everyone up. What is the probability that the mother ends up in the first chair or the father ends up in the first chair?

Example. Two dice are rolled. The first is a fair 6-sided die. The second is a fair 4-sided die. Once they are rolled, the two numbers on the two dice are used to create a 2-digit number. The number from the 6-sided die is used to make the tens digit. The number from the 4-sided die is used to make the ones digit. What is the probability that the resulting number is odd or ends with a 2?

Example. A bucket contains some bouncy balls that are colored as well as numbered. The following table indicates the number of each kind of ball in the bucket.

        Yellow  Green  Orange  Red  Blue  Brown  Purple  Total
Odd
Even
Total

1. If a ball is randomly chosen, what is the probability that the ball will be a blue even ball or a purple odd ball?
2. If a ball is randomly chosen, what is the probability that the ball will be blue or purple?
3. If a ball is randomly chosen, what is the probability that the ball will be even or purple?

Rule for Complementary Events

Example. Find the indicated probabilities.
1. Suppose P(A) = .23. Find P(Ā).
2. Suppose P(Ā) = .12, P(B̄) = .21, P(C̄) = .22. Find P(B).

Example. Same bucket as used in the example above. What is the probability that a randomly selected ball is neither brown nor even?

        Yellow  Green  Orange  Red  Blue  Brown  Purple  Total
Odd
Even
Total

Example. Two dice are rolled. The first is a fair 6-sided die. The second is a fair 4-sided die. Once they are rolled, the two numbers on the two dice are used to create a 2-digit number. The number from the 6-sided die is used to make the tens digit. The number from the 4-sided die is used to make the ones digit. What is the probability that the resulting number is not 42?

Example. A single card is chosen at random from a standard deck of 52 playing cards. What is the probability of choosing neither a king nor a club?

Example. In a group of 700 families, 75 had more than 3 children, 125 had exactly 3 children, 300 had 2 children, and 100 had only a single child. If one family is randomly selected, what is the probability that it will have no children?
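The card and family examples both use the idea that an event and its complement have probabilities adding to 1. The following Python sketch assumes that standard complement rule, P(not A) = 1 - P(A); the variable names are illustrative.

```python
from fractions import Fraction

# Families example: count the families that have at least one child.
total_families = 700
with_children = 75 + 125 + 300 + 100
p_no_children = 1 - Fraction(with_children, total_families)

# Card example: P(king or club) = 4/52 + 13/52 - 1/52, then take the complement.
p_king_or_club = Fraction(4 + 13 - 1, 52)
p_neither = 1 - p_king_or_club

print(p_no_children)  # 1/7
print(p_neither)      # 9/13
```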

Conditional Probability & The Multiplication Rule

Definition. Let A and B be two events. The conditional probability of A given B, P(A | B), is the probability that A happens given the information that B occurs. It is the probability of an event with the additional information that some other event has already occurred. The probability of B given A is denoted analogously by P(B | A).

Example. A bucket contains some bouncy balls that are colored as well as numbered. The following table indicates the number of each kind of ball in the bucket. The contents of the bucket are separated into two buckets, odds and evens. If we randomly select a single ball from the odd bucket, what is the probability that the ball is red?

        Yellow  Green  Orange  Red  Blue  Brown  Purple  Total
Odd
Even
Total

Example. Two dice are rolled. The first is a fair 6-sided die. The second is a fair 4-sided die. Once they are rolled, the two numbers on the two dice are used to create a 2-digit number. The number from the 6-sided die is used to make the tens digit. The number from the 4-sided die is used to make the ones digit. If a two appeared on the 6-sided die, what is the probability that the resulting number is odd?

Example. Five cards are dealt from a freshly shuffled deck of cards. Suppose the first four cards are kings. What is the probability that the fifth card will be an ace?

Definition. Two events A and B are independent if the occurrence of one event does not affect the probability of the occurrence of the other event. If A and B are not independent, they are said to be dependent.

If two events A and B are independent, then
P(A | B) = P(A)
P(B | A) = P(B)
P(B and A) = P(A) P(B)

Example. Given two events A and B. Suppose that P(A | B) = .8 and P(A) = .81. Are the events A and B independent?

Example. Given two independent events A and B. Suppose that P(B) = .8 and P(A) = .42. Find P(B and A).

Example. An urn contains 2 colored balls: 1 blue & 1 red. Two balls are removed, one at a time, replacing each after it is drawn. What is the probability that the second ball is red, if the first was blue?

Example. An urn contains 2 colored balls: 1 blue & 1 red. Two balls are removed, one at a time, without replacing each after it is drawn. What is the probability that the second ball is red, if the first was blue?

Remark 2.2. The method used for selecting, or sampling, items is very important and can determine whether two events are independent or dependent.
Selections (Sampling) without replacement: Dependent events.
Selections (Sampling) with replacement: Independent events.

Formal Multiplication Rule
P(A and B) = P(A) · P(B | A)

Example. A bucket contains several colored bouncy balls: red, yellow, and blue. One at a time, two balls are removed from the bucket. After the first ball is removed, it will not be replaced. What is the probability that the first ball is red and the second bouncy ball is green?

Example. A bucket contains several colored bouncy balls: red, yellow, and blue. One at a time, two balls are removed from the bucket. After the first ball is removed, it is replaced. What is the probability that the first ball is red and the second bouncy ball is green?

Example. If two cards are dealt from a deck without replacing them, what is the probability that an ace will be dealt first and a two will be dealt second?

Example. If two cards are dealt from a deck with replacement, what is the probability that an ace will be dealt first and a two will be dealt second?
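The two card examples differ only in P(B | A). A Python sketch of the Formal Multiplication Rule for a standard 52-card deck (4 aces, 4 twos), with illustrative variable names:

```python
from fractions import Fraction

# Without replacement: the second draw comes from the 51 remaining cards.
p_without = Fraction(4, 52) * Fraction(4, 51)

# With replacement: the deck is restored, so the draws are independent.
p_with = Fraction(4, 52) * Fraction(4, 52)

print(p_without)  # 4/663
print(p_with)     # 1/169
```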

More Conditional Probability

Definition. Let A and B be two events. The conditional probability of A given B, P(A | B), is the probability that A happens given the information that B occurs. It is the probability of an event with the additional information that some other event has already occurred.

P(B | A) = P(A and B) / P(A)

Example. A statistics professor tosses two coins that cannot be seen by any of the students. One student asks: "Did one of the coins turn up heads?" Suppose the professor answered yes. Find the probability that both coins turned up heads.

Example. An urn contains 3 colored balls: 2 blue & 1 red. Two balls are removed, one at a time, without replacing each after it is drawn. What is the probability that the second ball is red, if the first was blue?
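The two-coin example can be worked by listing outcomes and applying P(B | A) = P(A and B) / P(A). This Python sketch assumes fair coins and reads the student's question as "did at least one coin turn up heads?"; both are assumptions, since the notes do not spell them out.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes for two fair coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# A: at least one coin shows heads (assumed reading of the question).
a = [o for o in outcomes if "H" in o]
# A and B: both coins show heads.
a_and_b = [o for o in a if o == ("H", "H")]

p_a = Fraction(len(a), len(outcomes))
p_a_and_b = Fraction(len(a_and_b), len(outcomes))

# P(B | A) = P(A and B) / P(A)
p_b_given_a = p_a_and_b / p_a
print(p_b_given_a)  # 1/3
```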

Example. A student answers a multiple choice examination question that has 4 possible answers. Suppose that the probability that the student knows the answer to the question is 0.80 and the probability that the student guesses is ___. Also, if the student guesses, the probability of a correct guess is ___. If the question is answered correctly, what is the probability that the student really knew the correct answer?

Chapter 3
Probability & Random Variables

Remark 3.1. This is Chapter 5 in the textbook.

Our goal is to compute probabilities for Random Procedures/Phenomena whose outcomes are numbers.

Definition 3.1. A random variable is a variable (typically represented by x) that has a single numerical value, determined by chance, for each outcome of a procedure. A random variable is a variable whose value is a numerical outcome of a random procedure/phenomenon.

Example 3.1. Examples of Random Variables.
The weight of a randomly selected package taken from the post office.
The amount of time it takes to walk from the first floor to the fourth floor.
The temperature of a randomly selected popsicle.
The amount of money you spend on your next tank of gas.
The number of lunches served in the cafeteria on a given day.
The color of a ball pulled out of a bucket.

There are two ways to assign probabilities to a random variable. These provide two types of random variables:

Definition 3.2. A Continuous Random Variable has infinitely many values, and the collection of values is not countable.

Definition 3.3. A Discrete Random Variable has a collection of possible values that is finite or countable.

Random variables will usually (but not always) be denoted by capital letters from the end of the alphabet. When a random variable describes a random phenomenon, the sample space S lists the possible values of the random variable.

Definition 3.4. A Probability Distribution is a description that gives the probability for each possible value of a random variable. It is often expressed as a table, a formula, or a graph.

Examples of Probability Distributions

Example 3.2. A bucket contains 4 green, 3 brown and 3 purple bouncy balls. A ball is randomly selected from the bucket. We check the color of the ball. (We could say that we count the number of green balls observed.)

Example 3.3. A bucket contains 4 green, 3 brown and 3 purple bouncy balls. One at a time, four balls are randomly removed, and replaced, from the bucket. We count the number of green balls observed.

Definition 3.5. A Binomial Probability Distribution results from a procedure that meets all the following requirements:
a.) The procedure has a fixed number of trials. A trial is a single observation.
b.) The trials must be independent. The outcome of any one trial has no effect on the probabilities in the other trials.
c.) Each trial must have all outcomes classified into two categories (commonly referred to as success and failure).
d.) The probability of a success remains the same for all trials.

If X has the Binomial distribution B(n, p) with n observations and probability p of success on each experiment, or observation, the possible values of X are 0, 1, 2, ..., n. If k is any one of these values, the binomial probability is

$P(X = k) = {}_nC_k \, p^k (1 - p)^{n - k}$.

The mean and standard deviation of a binomial random variable X are

$\mu = np, \qquad \sigma = \sqrt{np(1 - p)}$.
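The binomial formula can be tabulated for Example 3.3, where each draw is green with probability 4/10 and there are four draws with replacement. A Python sketch (the function name `binomial_pmf` is made up for illustration):

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) = nCk * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.4  # Example 3.3: four draws with replacement, P(green) = 4/10

# Probability distribution of X = number of green balls observed.
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
mean = n * p
std_dev = math.sqrt(n * p * (1 - p))

for k, prob in distribution.items():
    print(k, round(prob, 4))
print("mean:", mean, "std dev:", round(std_dev, 4))
```

The five probabilities add up to 1, which is a quick sanity check on any distribution table.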

Example 3.4. A coin is tossed four times.
1. What is the probability distribution of the discrete random variable X that counts the number of heads?
2. Find P(X > 1).
3. Find P(X __ 1).
4. Find P(X __ 1).

Remark 3.2. A Binomial Probability Distribution results from a procedure that meets all the following requirements:
a.) The procedure has a fixed number of trials. A trial is a single observation.
b.) The trials must be independent. The outcome of any one trial has no effect on the probabilities in the other trials.
c.) Each trial must have all outcomes classified into two categories (commonly referred to as success and failure).
d.) The probability of a success remains the same for all trials.

Definition 3.6. If X has the Poisson distribution, Poisson(µ), with mean number of occurrences equal to µ, the possible values of X are 0, 1, 2, 3, .... If x is any one of these values, the Poisson probability is

$P(x) = \dfrac{\mu^x e^{-\mu}}{x!}$.

The mean is µ. The standard deviation of a Poisson random variable X is $\sigma = \sqrt{\mu}$.

Remark 3.3. A Poisson Probability Distribution results from a procedure that meets all the following requirements:
a.) The random variable counts the number of occurrences of an event over a time interval;
b.) The occurrences must be random, independent, and uniformly distributed over the time interval.

Example 3.5. Assume that the mean number of aircraft accidents in the United States is 8.5 per month. Use the Poisson distribution to find the probability that in a month there will be
a.) 6 aircraft accidents.
b.) at least 5 aircraft accidents.
c.) no more than 7 aircraft accidents.
d.) Over a one year period, how many aircraft accidents would you expect there to be?
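Example 3.5 can be evaluated directly from the Poisson formula. The Python sketch below uses only the standard library; the function name `poisson_pmf` is illustrative, and part d.) assumes the monthly mean simply scales to 12 months.

```python
import math

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)

mu = 8.5  # mean number of accidents per month (from Example 3.5)

p_exactly_6 = poisson_pmf(6, mu)
# "No more than 7" means x = 0, 1, ..., 7.
p_at_most_7 = sum(poisson_pmf(x, mu) for x in range(8))
# "At least 5" is the complement of "4 or fewer".
p_at_least_5 = 1 - sum(poisson_pmf(x, mu) for x in range(5))
# Expected count over 12 months, assuming the same monthly mean all year.
expected_per_year = 12 * mu

print(round(p_exactly_6, 4), round(p_at_least_5, 4), round(p_at_most_7, 4))
print(expected_per_year)  # 102.0
```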

PDF vs CDF

More Examples of Random Variables - Continuous

The probability distribution of X is described by a density curve (a graph). The probability of any event is the area under the density curve and above the x-axis, and between the values of X that make up the event. The total area under a density curve is equal to 1, and a density curve never goes below the x-axis. Every individual outcome for a continuous random variable has probability zero.

Definition 3.7. A continuous random variable has a uniform distribution if its values are spread evenly over the range of possible values. The density curve (graph) of a uniformly distributed random variable is a rectangle.

Example 3.6. The amount of time a particular subway train will wait at a station is uniformly distributed between 5 and 10 minutes. Find the probability that the train will wait
1. exactly 6 minutes.
2. at most 6 minutes.
3. at least 7 minutes.
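Because the density curve in Example 3.6 is a rectangle of width 10 - 5 = 5 and height 1/5, each probability is just an area. A Python sketch (the helper `uniform_prob` is a made-up name):

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: overlap width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

a, b = 5, 10  # waiting time in minutes (Example 3.6)

p_exactly_6 = uniform_prob(6, 6, a, b)   # a single point has zero width
p_at_most_6 = uniform_prob(a, 6, a, b)
p_at_least_7 = uniform_prob(7, b, a, b)

print(p_exactly_6, p_at_most_6, p_at_least_7)  # 0.0 0.2 0.6
```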

Definition 3.8. A continuous random variable X has a normal distribution with mean µ and standard deviation σ if its density curve is given by

$y = \dfrac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$.

[Figure: normal distribution density curves, x-axis marked at µ - 3σ, µ - 2σ, µ - 1σ, µ, µ + 1σ, µ + 2σ, µ + 3σ]

Example 3.7. The heights of fully grown white oak trees are normally distributed with a mean height of 90 feet and standard deviation of 3.5 feet.
1. What is the probability that a randomly selected fully grown white oak tree is less than 87 feet tall?
2. What is the probability that a randomly selected fully grown white oak tree is greater than 94 feet tall?

Example 3.8. The ACT is an exam used by colleges and universities to evaluate undergraduate applicants. The test scores are normally distributed. In a recent year, the mean test score was 20.1 and the standard deviation was ___.
1. What is the probability that a randomly selected ACT score is between 16 and 24?
2. What is the probability that a randomly selected ACT score is greater than 22.5?
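Areas under a normal curve are normally found with a calculator or table. As an alternative sketch, Python's standard library (3.8 or later) provides `statistics.NormalDist`, which can evaluate Example 3.7; this is offered only as a cross-check, not as the method used in class.

```python
from statistics import NormalDist

heights = NormalDist(mu=90, sigma=3.5)  # white oak heights in feet (Example 3.7)

# Area to the left of 87 feet.
p_less_than_87 = heights.cdf(87)
# Area to the right of 94 feet is 1 minus the area to the left.
p_greater_than_94 = 1 - heights.cdf(94)

print(round(p_less_than_87, 4), round(p_greater_than_94, 4))
```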

Definition 3.9. A continuous random variable X has a t-distribution with k degrees of freedom, if its density curve is given by

$y = \dfrac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\,\Gamma\left(\frac{k}{2}\right)} \left(1 + \dfrac{x^2}{k}\right)^{-\frac{k+1}{2}}$.

[Figure: t distribution density curves]

Definition. A continuous random variable X has a χ²-distribution with k degrees of freedom, if its density curve is given by

$y = \dfrac{1}{2^{k/2}\,\Gamma\left(\frac{k}{2}\right)} \, x^{\frac{k}{2} - 1} e^{-\frac{x}{2}}$.

[Figure: chi-square distribution density curves]

Measuring the Center of a Distribution

[Figure: example distributions for several values of p, including p = 0.1, p = 0.25, and p = 0.5]

Definition. The mean of a probability distribution, or the mean of a random variable, is a number that indicates the center, or location, of the random variable's distribution.

If X is a discrete random variable whose distribution is

Possible Value of X    x_1       x_2       ...    x_k
Probability            P(x_1)    P(x_2)    ...    P(x_k)

then the mean of X is computed as follows:

$\mu_X = x_1 P(x_1) + x_2 P(x_2) + \cdots + x_k P(x_k)$

The mean for a random variable X is also called the EXPECTED VALUE OF X.

If you repeat a random procedure a very large number of times, the average of the observed values of the random variable will be very close to the mean of the random variable. The mean is what you expect to see on average.

If a random variable X has a Binomial Distribution with n trials and probability of success p, then $\mu_X = np$.

You will not need to compute the mean for a continuous random variable X.

Measuring the Spread of a Distribution

Definition. The standard deviation of a probability distribution, or the standard deviation of a random variable, is a number that indicates the spread, or dispersion, of the random variable's distribution.

If X is a discrete random variable with mean µ, and distribution

Possible Value of X    x_1       x_2       ...    x_k
Probability            P(x_1)    P(x_2)    ...    P(x_k)

then the standard deviation of X is

$\sigma = \sqrt{(x_1 - \mu)^2 P(x_1) + (x_2 - \mu)^2 P(x_2) + \cdots + (x_k - \mu)^2 P(x_k)}$

Example 3.9. Determine the mean, standard deviation, and variance for the following distribution:

X    P(X)

If a random variable X has a Binomial Distribution with n trials and probability of success p, then $\sigma_X = \sqrt{np(1 - p)}$.

You will not need to compute the mean for a continuous random variable.

Variance
The Variance of a random variable X is its standard deviation squared. The Variance of a random variable is another measure of the spread of a random variable's distribution.
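The mean, variance, and standard deviation formulas can be tried on a distribution whose table is known. The Python sketch below uses the number of heads in four fair coin tosses (the setting of Example 3.4); the probabilities 1/16, 4/16, 6/16, 4/16, 1/16 are worked out here under the assumption of a fair coin and are not taken from the notes.

```python
import math

# Distribution of X = number of heads in four fair coin tosses.
values = [0, 1, 2, 3, 4]
probs = [1/16, 4/16, 6/16, 4/16, 1/16]

# Mean: sum of x * P(x).
mean = sum(x * p for x, p in zip(values, probs))
# Variance: sum of (x - mean)^2 * P(x); the standard deviation is its square root.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)  # 2.0 1.0 1.0
```

These agree with the binomial shortcuts for n = 4 and p = 0.5: np = 2 and the square root of np(1 - p) is 1.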

Percentiles & Critical Values

Percentiles

Definition. The 100α-th percentile is a number, $P_{100\alpha}$, that divides the probability distribution of a random variable X into two parts where $P(X \le P_{100\alpha}) \ge \alpha$ and $P(X \ge P_{100\alpha}) \ge 1 - \alpha$.

The 100α-th percentile is a number, $P_{100\alpha}$, that separates the bottom 100α% of a distribution from the top 100(1 - α)%.

Calculator commands:

Normal            InvNorm(α, µ, σ)
Chi Square        MATH Solver...   0 = α - χ²cdf(0, X, df)       ENTER ALPHA ENTER
t distribution    InvT(α, df)
                  MATH Solver...   0 = α - tcdf(-2^99, X, df)    ENTER ALPHA ENTER

Example. Find P_99 for a t-distributed random variable with 5 degrees of freedom.

Example. Find P_95 for a χ²-distributed random variable with 3 degrees of freedom.

Example. Find P_90 for a normally distributed random variable with µ = 5 and σ = 3.

Example. In a large section of a statistics class, the points for the final exam are normally distributed with a mean of 72 and a standard deviation of 9. Find the lowest score on the final exam that would qualify a student for an A, if an A should include the top 10% of the class.

Example. The annual per capita utilization of apples (in pounds) in the United States can be approximated by a normal distribution with µ = 17.4 lb. and σ = 4 lb. What annual per capita utilization of apples represents the 10th percentile?
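The three normal-distribution examples above ask for percentiles, which is what InvNorm does on the calculator. As a sketch, the same values can be obtained in Python with `statistics.NormalDist.inv_cdf` (Python 3.8 or later). The standard library has no t or chi-square inverse, so those two examples are not covered here.

```python
from statistics import NormalDist

# 90th percentile for mu = 5, sigma = 3.
p90 = NormalDist(mu=5, sigma=3).inv_cdf(0.90)

# Lowest A score: the top 10% of the class starts at the 90th percentile.
lowest_a = NormalDist(mu=72, sigma=9).inv_cdf(0.90)

# 10th percentile of annual per capita apple utilization (pounds).
apples_p10 = NormalDist(mu=17.4, sigma=4).inv_cdf(0.10)

print(round(p90, 2), round(lowest_a, 2), round(apples_p10, 2))
```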

Critical Values

Definition. A critical value is a number that is used to separate unusual (unlikely) values for a random variable from those values that are expected (likely) to occur.

The placement of a critical value will depend on:
the distribution of the random variable;
the significance level α used to define what it means for an event to be unlikely.

Some questions will require the determination of two critical values.

(Usual, Expected, Common, Likely) values will generally be considered values close to the mean.

(Unusual, Unexpected, Surprising, Unlikely) values will generally be considered values far from the mean.

Critical Values for Specific Distributions

Notation 3.1. $z_\alpha$, or z, denotes a critical value for a Standard Normal Random Variable with an area, or probability, of α to its right.

Example. Find $z_{.05}$.

Notation 3.2. $t_{\alpha,k}$, or t, denotes a critical value for a t-Random Variable, with k degrees of freedom, with an area, or probability, of α to its right.

Example. Find $t_{.05,3}$.

Notation 3.3. $\chi^2_{\alpha,k}$ denotes a critical value for a χ²-Random Variable, with k degrees of freedom, with an area, or probability, of α to its right.

Example. Find $\chi^2_{.05,4}$.

The critical values given above define (Unusual, Unexpected, Surprising, Unlikely) values to be numbers that are far from zero. Later, we will define these values to be the distance between what we expect to happen, and what actually happens. This translates into the idea that unlikely values are those that are a great distance (relatively) from what we expect.
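Since a critical value has area α to its right, it has area 1 - α to its left, so it is the 100(1 - α)-th percentile. A Python sketch for the standard normal case only (the standard library does not include t or chi-square distributions):

```python
from statistics import NormalDist

alpha = 0.05
# Area alpha to the right means area 1 - alpha to the left.
z_alpha = NormalDist().inv_cdf(1 - alpha)
print(round(z_alpha, 3))  # 1.645
```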

Tail Events & Tail Probabilities

Definition. A one-tail event for a random variable X is an event such as {X ≤ t} or {X ≥ t}, where t is any number.

Definition. A two-tail event for a random variable X is an event such as {X > t or X < r}, where r < t are any numbers.

Definition. A tail probability is the probability of a (two-) tail event.

Percentiles and Critical Values are defined in terms of tail events.

If a tail probability is smaller than a given significance level, α, then the tail event will be considered unlikely.

If a tail probability is smaller than a given significance level, α, then any outcome within that tail event will be considered (Unusual, Unexpected, Surprising, Unlikely).

Depending upon the situation, and significance level α, we may define (Unusual, Unexpected, Surprising, Unlikely) values to be values that are
Far from the mean µ AND too small
Far from the mean µ AND too big
Far from the mean µ AND either too big or too small

Chapter 4
Samples

[Figure: diagram relating Population, Sample, Parameter, and statistic]

Remark 4.1. You should read Chapter 1 from your textbook. We will cover only the information necessary for the procedures that will be introduced later.

4.1 Goals:
Describe a population's unknown distribution;
Describe a population's unknown parameters;
Describe the nature of the relationship between populations.

4.2 Collecting Data

Definition 4.1. SAMPLE:
1. VERB To sample a population is the act of selecting individuals, items, objects, or members of a population.
2. NOUN A Sample is the subset of the population that has been selected.

Definition 4.2. A simple random sample of n subjects is selected in such a way that every possible sample of the same size n has the same probability of being selected.

All of the procedures that will be discussed later will use a simple random sample.

A simple random sample is a selection of n subjects without replacement. This means we have dependent selections from a finite population.

If the sample size is no more than 5% of the overall population, we will treat the selections as being independent. We will think of our samples as selections made with replacement.

For examples in class, we will take samples (make selections) with replacement.
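A simple random sample without replacement, together with the 5% guideline, can be illustrated in Python. Everything specific in this sketch is hypothetical: the population of 1000 labeled subjects, the sample size of 40, and the random seed are chosen only for the illustration.

```python
import random

population = list(range(1, 1001))  # hypothetical population of 1000 labeled subjects
n = 40                             # hypothetical sample size (4% of the population)

rng = random.Random(0)             # seeded only so the sketch is reproducible
# random.sample selects without replacement; every subset of size n is equally likely.
sample = rng.sample(population, n)

# 5% guideline: small samples are treated as if the selections were independent.
treat_as_independent = n <= 0.05 * len(population)
print(len(sample), treat_as_independent)  # 40 True
```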

Other Sample Types

Definition 4.3. In systematic sampling, we select some starting point and then select every kth element in a population.

Definition 4.4. In stratified sampling, we subdivide the population into at least two different subgroups (or strata) so that subjects within the same subgroup share the same characteristics. Then we draw a sample from each subgroup (or stratum).

Definition 4.5. In cluster sampling, we first divide the population area into sections (or clusters). Then we randomly select some of those clusters and choose all the members from those selected clusters.

Definition 4.6. With convenience sampling, we simply use results that are very easy to get.

Definition 4.7. In an observational study, we observe and measure specific characteristics, but do not attempt to modify the subjects being studied.

Definition 4.8. In an experiment, we apply some treatment and then proceed to observe its effects on the subjects. (Subjects in experiments are called experimental units.)

Types of Observational Studies

Definition 4.9. In a cross-sectional study, data are observed, measured, and collected at one point in time.

Definition. In a retrospective study, data are collected from the past by going back in time (through examination of records, interviews, and so on).

Definition. In a prospective study, data are collected in the future from groups sharing common factors.

Describing Populations using Graphs of Sample Data

Graphs of sample (quantitative) data can be used to make guesses about the distribution of a population. We will look at the graphs to determine whether they appear to be:
- Normal
- Uniform
- Symmetric
- Skewed

Definition. A (relative) frequency histogram is a graph consisting of bars of equal width drawn adjacent to each other (unless there are gaps in the data). The horizontal scale represents classes of quantitative data values and the vertical scale represents (relative) frequencies. The heights of the bars correspond to the (relative) frequency values.
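A histogram is built from a table of class counts. The sketch below tabulates frequencies and relative frequencies for the 20-value data set that appears in this chapter's percentile examples; the class width of 20 and the starting point of 0 are assumptions made for illustration.

```python
DATA = [2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78]

def frequency_table(data, lower, width, n_classes):
    """Count values in the classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        i = int((x - lower) // width)
        if 0 <= i < n_classes:
            counts[i] += 1
    n = len(data)
    return [(lower + i * width, lower + (i + 1) * width, c, c / n)
            for i, c in enumerate(counts)]

# Text version of the histogram: one row per class, one star per observation.
for lo, hi, count, rel in frequency_table(DATA, lower=0, width=20, n_classes=4):
    print(f"[{lo:>2}, {hi:>2})  {'*' * count:<15}  {count:>2}  {rel:.2f}")
```

Most of the observations fall in the first class and a few fall far to the right, so this sample looks skewed rather than Normal or Uniform.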

Remark 4.2. Having a guess about the SHAPE of a distribution allows you to make a guess about how to compute probabilities about future samples from the same type of distribution. If we do not know the SHAPE of a distribution, we CANNOT make any GOOD guesses about the probability of an event.

Assessing Normality with a Small Data Set

With a small data set, the shape of a distribution may not be very clear. It is very important for us to be able to identify populations with Normal distributions. A normal quantile plot can assist us with this.

[Figures: normal quantile plots for a Normal distribution and for a non-Normal distribution]
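The points of a normal quantile plot can be computed by pairing each sorted data value with a standard normal quantile. The sketch below assumes the plotting position (i - 0.5)/n, which is one common convention; software packages may use slightly different positions, and the sample values are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Pair each sorted value with the standard normal quantile at (i - 0.5)/n."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()   # mean 0, standard deviation 1
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(xs, start=1)]

sample = [12.1, 9.8, 10.4, 11.0, 10.9, 9.5, 10.2, 11.6]   # hypothetical data
for theoretical, observed in normal_quantile_points(sample):
    print(f"{theoretical:6.2f}  {observed:5.1f}")
```

If the plotted points fall close to a straight line, a Normal distribution is a reasonable guess for the population; strong curvature suggests a non-Normal shape.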

Stemplot

A stemplot (stem-and-leaf plot) is a quick way to look at the SHAPE of a distribution if you are working by hand and have a relatively small data set.

[Stemplot: columns Stem and Leaf]

Other Types of Graphics

Definition. A scatterplot is a plot of paired (x, y) quantitative data with a horizontal x-axis and a vertical y-axis.
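Returning to the stemplot: the construction is mechanical enough to sketch in code. The version below assumes non-negative whole-number data, uses the tens (and higher) digits as the stem and the ones digit as the leaf, and is applied to the 20-value data set from this chapter's percentile examples.

```python
from collections import defaultdict

DATA = [2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78]

def stemplot(values):
    """Return stem-and-leaf rows; stem = value // 10, leaf = value % 10."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | " + "".join(str(d) for d in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows

print("\n".join(stemplot(DATA)))
```

Turned on its side, the output is a rough histogram, so the long tail of large values is easy to see.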

Definition. A time-series graph is a graph of time-series data, which are quantitative data that have been collected over a period of time.

Definition. A Pareto chart is a bar graph for categorical data, with the bars arranged in descending order according to frequencies.

Definition. A pie chart is a graph that depicts categorical data as slices of a circle, in which each slice is proportional to the frequency count for the category.

Estimating Population Parameters using Sample Data

With a probability distribution for a random variable, we defined several numbers that could be used to describe the characteristics of the distribution:
- Center: Mean
- Spread: Standard Deviation
- Proportion of Successes
- Percentiles

If we have a population but don't know its distribution, we probably don't know some of these parameters. We will need a method to estimate these parameters based on samples that we take.

Remark 4.3. Not every parameter is interesting for every population.

Estimating a Population Mean

Definition. The sample mean is an estimate of the mean of a probability distribution. It can be found by adding all the sample data values together and dividing by the sample size.

$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$

Example 4.1. Find the mean of the following sample values:

Facts about the sample mean
- It is a statistic.
- It is one possible measure of the center of a SAMPLE.
- It is an estimate of the center of a probability distribution.
- Its value will change depending upon the sample taken.
- One extreme value can change the value of the mean substantially.
- Sample means drawn from the same population tend to vary less than other measures of center.
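A short sketch of the sample mean formula, which also shows how a single extreme value moves the mean. The sample values are hypothetical, since the values for Example 4.1 are not reproduced here.

```python
def sample_mean(values):
    """x-bar = (x_1 + x_2 + ... + x_n) / n"""
    return sum(values) / len(values)

sample = [4, 5, 5, 6, 7]        # hypothetical sample values
with_extreme = sample + [60]    # the same sample plus one extreme value

print(sample_mean(sample))
print(sample_mean(with_extreme))
```

Adding the single value 60 pulls the mean well above every one of the original five observations.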

Estimating the SAMPLE MEAN from a Frequency Distribution

[Table: columns # and Frequency; total N = 106]

Estimating the SAMPLE MEAN from a Relative Frequency Distribution

[Table: columns # and Frequency]
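When the data arrive as a frequency table, the mean is a weighted average: each value is multiplied by its frequency (or its relative frequency). The table below is hypothetical and is not the N = 106 table from the notes.

```python
def mean_from_frequencies(freq):
    """x-bar = sum(value * frequency) / sum(frequency)"""
    n = sum(freq.values())
    return sum(value * f for value, f in freq.items()) / n

def mean_from_relative_frequencies(rel_freq):
    """x-bar = sum(value * relative frequency); the weights sum to 1."""
    return sum(value * r for value, r in rel_freq.items())

freq = {0: 3, 1: 5, 2: 2}   # hypothetical: value -> count, N = 10
rel_freq = {value: f / sum(freq.values()) for value, f in freq.items()}

print(mean_from_frequencies(freq))
print(mean_from_relative_frequencies(rel_freq))
```

Both routes give the same answer because a relative frequency is just the frequency divided by N.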

Estimating a Population Standard Deviation

Definition. The sample standard deviation is an estimate of the standard deviation of a probability distribution. It is denoted by s and is a measure of how much the sample data deviate from the sample mean $\bar{x}$.

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Example 4.2. Find the sample standard deviation of the following sample values:

Facts about the sample standard deviation
- $s \geq 0$
- $s = 0$ only if all of the data values are the same.
- s will increase greatly if only one additional data value is added that looks very different from the others.
- The units for s are the same as the units of the original data.
- $s^2$, the sample variance, is another measure of variation. It is the square of the sample standard deviation.
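The formula for s can be checked step by step against Python's built-in `statistics.stdev`, which also divides by n - 1. The sample values are hypothetical, since the values for Example 4.2 are not reproduced here.

```python
import math
import statistics

def sample_std(values):
    """s = sqrt( sum((x - xbar)^2) / (n - 1) )"""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample values

print(sample_std(sample))
print(statistics.stdev(sample))     # library version of the same formula
print(sample_std([3, 3, 3]))        # identical values give s = 0
```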

Estimating the STANDARD DEVIATION from a Dataset

[Table: columns # and Frequency; total N = 106]

Definition. The range of a data set is the measure of spread found by subtracting the smallest data value from the largest data value.

Range Rule of Thumb

$\sigma \approx \frac{\text{Range}}{4}$

Estimating a Proportion of Successes

Definition. The sample proportion is an estimate of the probability of a success p for some random procedure. It is denoted by $\hat{p}$.

$\hat{p} = \frac{\text{\# of successes}}{n}$

Example 4.3. Find the sample proportion for the following samples:

Estimating Percentiles

Definition. The 100α-Percentile of a dataset, $P_{100\alpha}$, is a number that breaks the ordered dataset into two groups, with about 100α% of the dataset less than or equal to $P_{100\alpha}$ and about 100(1 − α)% of the dataset greater than or equal to $P_{100\alpha}$.

Finding the Percentile of a Data Value

$\text{Percentile of } x = \frac{\text{\# of data values} < x}{n} \cdot 100$ (round up)

Example 4.4. Find the percentile of 18 for the following data:
2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78

Converting a Percentile to a Data Value

$L = \left(\frac{k}{100}\right) n$

Example 4.5. Find the value of the 20th percentile, $P_{20}$, for the following data:
2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78

Example 4.6. Find the value of the 33rd percentile, $P_{33}$, for the following data:
2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78
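Both percentile computations can be sketched on the data set from Examples 4.4 to 4.6. The notes give the locator L = (k/100)n but the rule for turning L into a data value is not spelled out here, so the sketch assumes a common textbook convention: if L is a whole number, average the L-th and (L+1)-th sorted values; otherwise round L up and take that value. Treat this convention as an assumption and use the rule given in class if it differs.

```python
import math

DATA = [2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78]

def percentile_of(x, data):
    """Percent of data values strictly less than x, rounded up."""
    below = sum(1 for v in data if v < x)
    return math.ceil(100 * below / len(data))

def percentile_value(k, data):
    """Value of P_k for a whole number 0 < k < 100 (assumed convention)."""
    xs = sorted(data)
    whole, remainder = divmod(k * len(xs), 100)   # L = k*n/100, no float error
    if remainder == 0:
        # L is whole: average the L-th and (L+1)-th values (1-based positions).
        return (xs[whole - 1] + xs[whole]) / 2
    # L is not whole: round up to position whole + 1, i.e. index `whole`.
    return xs[whole]

print(percentile_of(18, DATA))      # Example 4.4
print(percentile_value(20, DATA))   # Example 4.5
print(percentile_value(33, DATA))   # Example 4.6
```

For Example 4.4, fourteen of the twenty values are below 18, so 18 sits at the 70th percentile.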

Boxplot - Using Sample Percentiles

Definition. For a set of data, the 5-number summary consists of these five values: Minimum, $Q_1$, $Q_2$, $Q_3$, Maximum.

Example 4.7. Give the 5-number summary for the following data:
2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78

Definition. A boxplot is a graph of a data set that consists of a number line extending from the minimum to the maximum data value, and a box drawn at the first, second, and third quartiles.

Example 4.8. Construct a boxplot for the following data:
2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78
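The 5-number summary for Example 4.7 can be sketched by treating the quartiles as the 25th, 50th, and 75th percentiles. Quartile conventions differ between textbooks and software; this sketch assumes the locator L = (k/100)n, averaging the L-th and (L+1)-th sorted values when L is whole and rounding L up otherwise.

```python
DATA = [2, 3, 4, 6, 7, 7, 8, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78]

def percentile_value(k, data):
    """Value of P_k for a whole number 0 < k < 100 (assumed convention)."""
    xs = sorted(data)
    whole, remainder = divmod(k * len(xs), 100)
    if remainder == 0:
        return (xs[whole - 1] + xs[whole]) / 2
    return xs[whole]

def five_number_summary(data):
    """Return (minimum, Q1, Q2, Q3, maximum)."""
    return (min(data),
            percentile_value(25, data),
            percentile_value(50, data),
            percentile_value(75, data),
            max(data))

print(five_number_summary(DATA))
```

These five numbers are all that is needed to draw the boxplot of Example 4.8: the line runs from the minimum to the maximum and the box is drawn at the three quartiles.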

IQR Guideline for outliers

It is always important to look for data values that don't apparently fit with the rest. Potential outliers can be identified as those data values that are
- less than $Q_1 - 1.5 \cdot \text{IQR}$, or
- greater than $Q_3 + 1.5 \cdot \text{IQR}$.

Example 4.9. Identify any potential outliers for the following data:
2, 3, 4, 6, 7, 7, 7, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78

This rule helps identify values that are far away from the central 50% of the data values.

Relative Distance From the Center

Definition. A z-score or standardized value is the number of standard deviations that a given value x is above or below the mean. A z-score is calculated as follows:

$z = \frac{x - \mu}{\sigma}$
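The IQR guideline can be sketched for the data of Example 4.9. The quartile values Q1 = 7 and Q3 = 20 are assumed inputs here; they come from locating the 25th and 75th percentiles with L = (k/100)n and averaging adjacent sorted values when L is whole, and a different quartile convention could shift them slightly.

```python
DATA = [2, 3, 4, 6, 7, 7, 7, 8, 9, 10, 13, 13, 14, 16, 18, 22, 22, 34, 56, 78]

def iqr_fences(q1, q3):
    """Return the cutoffs (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def potential_outliers(data, q1, q3):
    """Data values that fall outside the two fences."""
    low, high = iqr_fences(q1, q3)
    return [x for x in data if x < low or x > high]

q1, q3 = 7, 20   # assumed quartiles for this data set (see the note above)
print(iqr_fences(q1, q3))
print(potential_outliers(DATA, q1, q3))
```

The lower fence is negative, so no small value can be flagged in this data set; only the largest values are candidates.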

Facts about z-scores
- A z-score allows a comparison of distances between two distributions that are spread out in different manners.
- In many cases, a z-score will represent the relative distance between an observation and a distribution's expected value.
- Large z-scores (in absolute value) will represent observations that are far from what is expected. These observations would be considered (Unusual, Unexpected, Surprising, Unlikely).
- Small z-scores (in absolute value) will represent observations that are close to what we expect. These observations would be considered (Usual, Expected, Common, Likely).

Example. Two statistics classes take an exam. The distribution of the test scores looked relatively normal. Class A had a mean of 72 and a standard deviation of 3. Class B had a mean of 83 and a standard deviation of 6. Michele is in Class A. She received a score of 81. Elaine is in Class B. She received a 91. Elaine obviously has the higher overall score, but who did better with respect to their class? Does either one of them have an unusually high score compared to their class?
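The comparison in this example comes down to two z-scores. The sketch below computes them and, because the scores looked relatively normal, also estimates how likely a score at least that high would be under a Normal model. The upper-tail probabilities are an added illustration; which probability counts as "unusual" depends on the significance level α you choose.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_michele = z_score(81, mean=72, sd=3)   # Class A
z_elaine = z_score(91, mean=83, sd=6)    # Class B

std_normal = NormalDist()
tail_michele = 1 - std_normal.cdf(z_michele)   # P(Z > z) under a Normal model
tail_elaine = 1 - std_normal.cdf(z_elaine)

print(f"Michele: z = {z_michele:.2f}, upper tail = {tail_michele:.4f}")
print(f"Elaine:  z = {z_elaine:.2f}, upper tail = {tail_elaine:.4f}")
```

Michele's score is 3 standard deviations above her class mean, while Elaine's is about 1.33 above hers, so Michele did better relative to her class even though her raw score is lower.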

Probability distribution of a z-score

The observations used in the computation of a z-score are generally the outcomes of some random procedure. Each observation represents the outcome of some random variable. If the observation has a Normal distribution, then the z-score is a random variable that has a standard normal distribution.

If $X \sim \text{Normal}(\mu_X, \sigma_X)$, then $z = \frac{X - \mu_X}{\sigma_X} \sim \text{Normal}(0, 1)$.

We can use this idea to make estimates about the probabilities of future events, or about proportions of a dataset.

Example. A sample was taken and the following histogram was made. Estimate the proportion of the data that was within 1 standard deviation of the mean. Which data values appear to be within 1 standard deviation of the mean?

Example. A sample was taken and the following histogram was made. Estimate the proportion of the data that was within 1 standard deviation of the mean. Which data values appear to be within 1 standard deviation of the mean?

Example. A sample was taken and the following histogram was made. Estimate the proportion of the data that was within 2 standard deviations of the mean. Which data values appear to be within 2 standard deviations of the mean?

Example. A sample was taken and the following histogram was made. Estimate the proportion of the data that was within 3 standard deviations of the mean. Which data values appear to be within 3 standard deviations of the mean?

Empirical Rule: Normal Distribution

[Figure: normal density curve; the x-axis is marked at μ − 3σ, μ − 2σ, μ − 1σ, μ, μ + 1σ, μ + 2σ, μ + 3σ]
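The proportions behind the Empirical Rule can be computed directly from the standard normal distribution, since the z-score of a Normal observation is Normal(0, 1). A minimal sketch:

```python
from statistics import NormalDist

std_normal = NormalDist()   # Normal(0, 1)

def proportion_within(k):
    """P(-k < Z < k): the share of a Normal population within k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.4f}")
```

These values are usually rounded to about 68%, 95%, and 99.7%, and they give a benchmark for comparing against the proportions estimated from the histograms in the examples.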

Sampling Distributions

Definition. The sampling distribution of a statistic is the distribution of that statistic based on a fixed sample size.

Recall. The following statistics are random variables:
- Sample Mean $\bar{x}$
- Sample Proportion $\hat{p}$
- Sample Standard Deviation s

Remark 4.4. Many other statistics exist.

Central Limit Theorem

Theorem 4.1. (Central Limit Theorem) Suppose that a random variable X has a mean $\mu_X$ and a standard deviation $\sigma_X < \infty$. Then the (sampling) distribution of $\bar{x}$, based on a simple random sample of size n, will be:
- Normally distributed with mean $\mu_X$ and standard deviation $\sigma_X/\sqrt{n}$, if X has a normal distribution.
- Approximately Normally distributed with mean $\mu_X$ and standard deviation $\sigma_X/\sqrt{n}$, if n > 30 and the distribution of X is not heavily skewed.

$\bar{x} \sim \text{Normal}\left(\mu_X, \frac{\sigma_X}{\sqrt{n}}\right)$
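The claim of the Central Limit Theorem can be checked with a small simulation. The sketch below draws many samples of size n = 36 from a Uniform(0, 1) population, which is not Normal but is not heavily skewed, and compares the mean and standard deviation of the sample means with μ and σ/√n. The population, the sample size, the number of repetitions, and the seed are all assumptions chosen for illustration.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, rng):
    """Return `reps` sample means, each from n Uniform(0, 1) draws."""
    return [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(1)        # fixed seed so the sketch is repeatable
n = 36
means = simulate_sample_means(n, reps=5000, rng=rng)

mu = 0.5                      # mean of Uniform(0, 1)
sigma = math.sqrt(1 / 12)     # standard deviation of Uniform(0, 1)

print("mean of sample means:", statistics.fmean(means), "vs mu =", mu)
print("sd of sample means:  ", statistics.stdev(means),
      "vs sigma/sqrt(n) =", sigma / math.sqrt(n))
```

The simulated values should land close to the theoretical ones, and a histogram of `means` would look roughly bell-shaped even though the population is flat.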

Example. The height of adult females is normally distributed with a mean of cm and a standard deviation of 8.6 cm.
1. What is the probability that a randomly selected female will be taller than 210 cm?
2. What is the probability that the average height of 25 randomly selected females will be greater than 210 cm?
3. (α = .01) What heights of females would be considered unusually tall?
4. (α = .01) If 25 women are randomly selected, what would be considered an unusually high average height?

Example. Suppose that the amount of time that you will wait for a bus, at a particular bus stop, has a mean of 10 minutes with a standard deviation of 1 minute.
1. What is the probability that on a randomly selected day, you will wait longer than 12 minutes?
2. What is the probability that over 31 randomly selected days you will wait longer than 12 minutes on average?
3. (α = .05) What would be considered an unusually long wait time?
4. (α = .05) Over the course of 31 randomly selected days, what would be considered an unusually long average wait?
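Parts 2 and 4 of the bus example concern an average over n = 31 days, so the Central Limit Theorem applies provided the wait-time distribution is not heavily skewed (an assumption, since the example does not state the distribution). The sketch treats "unusually long" as the cutoff whose upper-tail probability equals α. Parts 1 and 3 concern a single wait and would need the distribution of X itself, so they are not computed here.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 10.0, 1.0, 31

# Sampling distribution of the mean wait: Normal(mu, sigma / sqrt(n)).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

# Part 2: probability that the 31-day average exceeds 12 minutes.
p_average_over_12 = 1 - xbar_dist.cdf(12)

# Part 4: average wait with only a 5% chance of being exceeded.
alpha = 0.05
cutoff = xbar_dist.inv_cdf(1 - alpha)

print(f"P(average > 12) = {p_average_over_12:.6f}")
print(f"unusually long average wait: more than {cutoff:.2f} minutes")
```

An average of 12 minutes is more than 11 standard errors above the mean, so its probability is essentially zero, while averages only a few tenths of a minute above 10 already count as unusual at α = .05.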

Corollary 4.2. If a population can be split into two disjoint groups, success and failure, the proportion of successes is equal to p, and a sample of size n is taken, where $np \geq 5$ and $n(1-p) \geq 5$, then

$\hat{p} \sim \text{Normal}\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$

Example. Seventy percent of a town is Republican. A random sample of 100 residents will be taken. What is the probability that more than 71% of those sampled will be Republicans?

Example. A coin is flipped 25 times. What is the probability that more than 60% of the flips will be tails?
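Both examples can be worked with the Normal model from Corollary 4.2. The sketch below checks the conditions on np and n(1 - p) first and applies no continuity correction; it also assumes a fair coin (p = 0.5) for the second example, which the example does not state explicitly.

```python
from math import sqrt
from statistics import NormalDist

def prob_phat_greater(p, n, threshold):
    """P(p-hat > threshold) using p-hat ~ Normal(p, sqrt(p(1-p)/n))."""
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("Normal model not justified: need np >= 5 and n(1-p) >= 5")
    phat_dist = NormalDist(p, sqrt(p * (1 - p) / n))
    return 1 - phat_dist.cdf(threshold)

# Town example: p = 0.70, n = 100, more than 71% Republican in the sample.
print(round(prob_phat_greater(0.70, 100, 0.71), 4))

# Coin example: p = 0.50 (assumed fair), n = 25, more than 60% tails.
print(round(prob_phat_greater(0.50, 25, 0.60), 4))
```

For the coin, 60% is exactly one standard deviation (0.1) above p = 0.5, so the answer is the upper-tail area beyond z = 1.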

Chapter 5 Inference: Confidence Intervals

Idea for a Confidence Interval

Confidence Intervals for a Single Population

Definition 5.1. A Confidence Level of 100(1 − α)% indicates that there is a probability of 1 − α that a random procedure produced an acceptable result.

Definition 5.2. An Interval Estimate is a range of numbers, determined by following a random procedure, used to estimate an unknown population parameter.

Definition 5.3. A 100(1 − α)% Confidence Interval is an Interval Estimate produced by following a procedure that correctly estimates an unknown population parameter at least 100(1 − α)% of the time, i.e. the procedure has a 100(1 − α)% Confidence Level.

General Procedure for Constructing a Confidence Interval for a Mean or Proportion

1. Decide how confident you want to be in your interval estimate.
2. Decide how precise you want your estimate to be.
3. Using Step 1 and Step 2, determine the necessary sample size n.
4. If the sample size determined in Step 3 is too large to manage, revisit Step 1 and Step 2.
5. Take a sample of at least size n.
6. Compute $\bar{x}$ or $\hat{p}$.
7. Compute your margin of error E.
8. Construct your Confidence Interval: (Estimate − Margin of Error, Estimate + Margin of Error).
9. State with 100(1 − α)% Confidence that the unknown parameter is captured by the confidence interval.

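Confidence Interval for a Population Mean

Here is one possible way to produce a confidence interval for a mean. However, it is unrealistic: it assumes that we know the population standard deviation σ.

$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

[Figure: standard normal curve for $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, with $-z_{\alpha/2}$, 0, and $z_{\alpha/2}$ marked on the horizontal axis]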

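Real Life

We don't know the distribution. In real life, we don't know σ. We estimate σ with s. We estimate the z-score with a t-score:

$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

[Figure: t distribution curve for $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$, with $-t_{\alpha/2}$, 0, and $t_{\alpha/2}$ marked on the horizontal axis]

100(1 − α)% Confidence Interval for μ

$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$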
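The t-based interval can be sketched as a small function. Python's standard library has no t distribution, so the critical value $t_{\alpha/2}$ is passed in as a number read from a t table. The summary statistics below are hypothetical; the critical value 2.201 is the tabled value for 95% confidence with n − 1 = 11 degrees of freedom.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Return (xbar - E, xbar + E) with margin of error E = t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample summary: n = 12, x-bar = 10.0, s = 2.0.
low, high = t_interval(xbar=10.0, s=2.0, n=12, t_crit=2.201)
print(f"{low:.3f} < mu < {high:.3f}")
```

Using a z critical value (1.96) here instead of 2.201 would give a narrower interval that overstates how precisely a sample of 12 pins down μ.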

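Confidence Interval for a Population Proportion

In a similar manner to the mean, we can make an estimate for a population proportion.

$\hat{p} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$

[Figure: standard normal curve for $z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$, with $-z_{\alpha/2}$, 0, and $z_{\alpha/2}$ marked on the horizontal axis]

We ended with a method for estimating the unknown population proportion p. This has the problem that we need to know the population proportion in order to estimate the population proportion. In practice, we replace p with $\hat{p}$ inside the square root.

100(1 − α)% Confidence Interval for p

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$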
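The interval for p can be sketched with the standard normal quantile function from Python's standard library. The counts below (120 successes in 200 trials) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Return (p_hat - E, p_hat + E) with E = z_{alpha/2} * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(successes=120, n=200, confidence=0.99)
print(f"{low:.4f} < p < {high:.4f}")
```

The interval is centered at $\hat{p}$ = 0.60, and raising the confidence level widens it because $z_{\alpha/2}$ grows.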

Examples

Example 5.1. Twelve leaves were randomly selected from the ground below a single tree and their length (cm) was measured. Use the following information to estimate the mean length of all leaves found under this tree. (95% Confidence)

$\bar{x} =$    $s =$

[Figures: Normal Q-Q Plot (Theoretical Quantiles vs. Sample Quantiles); Histogram of Data (Data vs. Frequency)]

Example 5.2. A survey of 17 randomly selected UTM students was conducted. (Not really.) They were each asked if they had ever seen an episode of The Walking Dead. Their responses are recorded below. A 1 indicates that they said yes. A 0 indicates that they said no. Estimate with 99% Confidence the true proportion of UTM students that have seen an episode of The Walking Dead.

Precision

A short Confidence Interval gives a more precise estimate for the unknown population parameter. Precision is controlled by three things:
- The desired and acceptable precision
- The Confidence Level
- The Sample Size

Example 5.3. A moving company is asked to move 10,000 identical blocks. The moving company wants to know how much each block weighs in order to determine what equipment is needed to move the blocks. The owner of the blocks knows that they all weigh about the same amount. Which would be a more useful guess?
- Between 2 and 300 pounds;
- Between 30 and 40 pounds.

Sample Size for Estimating a Population Mean

$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$ (round up)

where σ is
- the known population standard deviation,
- an estimate of the population standard deviation taken from a previous study, or
- estimated using the range rule of thumb.

Sample Size for Estimating a Population Proportion

When an estimate of p is known: $n = \hat{p}(1-\hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^2$ (round up)

When an estimate of p is unknown: $n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^2$ (round up)

Example 5.4. You want to estimate the mean SAT score of all college applicants. Possible SAT scores range from 600 to How many scores must be sampled if you would like to estimate the population mean score to within 100 points with 98% confidence?

Example 5.5. Find the sample size needed to estimate the percentage of Republicans among registered voters in California to within 3 percentage points with 90% confidence.

Example 5.6. A prior Pew Research Center report suggests that 15% of adults have consulted fortune tellers. Determine the sample size necessary to estimate the percentage of adults that consult fortune tellers within 3 percentage points with 98% confidence.
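The sample size formulas can be sketched and applied to Examples 5.5 and 5.6. The sketch uses exact normal quantiles from Python's standard library rather than table-rounded values such as 1.645 or 2.33, so a hand computation with rounded z-values can differ by a few units. The inputs to the mean formula (σ = 15, E = 2, 95% confidence) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def z_crit(confidence):
    """z_{alpha/2} for the given confidence level, e.g. 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_mean(sigma, E, confidence):
    """n = (z * sigma / E)^2, rounded up."""
    return ceil((z_crit(confidence) * sigma / E) ** 2)

def n_for_proportion(E, confidence, p_hat=None):
    """n = p_hat(1 - p_hat)(z / E)^2, or 0.25 (z / E)^2 with no estimate; rounded up."""
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return ceil(pq * (z_crit(confidence) / E) ** 2)

print(n_for_proportion(E=0.03, confidence=0.90))              # Example 5.5
print(n_for_proportion(E=0.03, confidence=0.98, p_hat=0.15))  # Example 5.6
print(n_for_mean(sigma=15, E=2, confidence=0.95))             # hypothetical inputs
```

The value 0.25 is the largest that p(1 − p) can be, so using it when no estimate of p is available guarantees a sample that is large enough.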
