University of Jordan Fall 2009/2010 Department of Mathematics

Handouts Part 1 (Chapter 1 - Chapter 5)

Chapter 1 Introduction

Introduction; Some Basic Concepts

Statistics is a science concerned with making decisions in the face of uncertainty. It comprises the following:
1) Descriptive statistics: concerned with the collection, organization, summarization and analysis of a body of data.
2) Inferential statistics: concerned with drawing inferences about a large body of data (called a population) by examining a part of that body (called a sample).

Statistical activities are motivated by the need to answer a question about a certain population. The usual setup starts with selecting a sample that resembles the population, in the sense that it has all the characteristics and properties of the population (such a sample is said to be an unbiased sample). Information is then collected from the sample and used to answer the question about the population. If the question (hence the data) is related to a medical, biological, or nutritional problem, we use the term biostatistics to distinguish this particular kind of statistical tools.

Now we introduce some of the vocabulary and concepts that are widely used in any statistics course.

Random Variable: The information or data collected from the subjects cannot be exactly predicted in advance; such quantities are referred to as random variables. Random variables are of two kinds:

Qualitative Variables: They divide the subjects into groups or categories. The value of a qualitative variable cannot be measured or counted; examples are the birthplace, gender, or marital status of an individual. Qualitative random variables are either nominal or ordinal. The possible values of a nominal random variable do not have a natural order. For example: gender, marital status, nationality. The possible values of an ordinal random variable can be ordered naturally. For example: rank, letter grade, degree of improvement such as low, weak, good, very good and excellent.

Quantitative Variables: The value of a quantitative variable can be measured or counted. We distinguish between two kinds of quantitative variables:
1. Discrete Variables: if the value of the variable can be counted, it is called a discrete random variable. Examples are the number of admissions to a general hospital and the number of family members of an individual. Discrete random variables are characterized by gaps or interruptions in the values they assume.
2. Continuous Variables: if the value of the variable can be measured, it is called a continuous random variable. An example is the period of treatment of a tuberculosis patient. A continuous random variable can assume any value within a specified relevant interval of values.

Sources of Data: The information about the subjects is usually collected from one or more of the following sources:
1. Routinely kept records or archives: for example, the medical history of a patient.
2. Surveys: if the data needed are not available in the kept records, then it is logical to think of a survey. For example, information about whether a patient received good treatment is not usually kept in the hospital records but can be surveyed.
3. Experiments: frequently the data needed to answer a question are available only as the result of an experiment. For example, a pediatrician or a dentist may try different motivation strategies with different children to find the strategy that maximizes children's compliance.
4. External Sources: the data needed to answer a question may already exist in the form of a published report. International organizations like WHO, or health ministries, usually publish reports that are a good source of data.

The Simple Random Sample (SRS)

If a sample of size n is drawn from a population of size N in such a way that every possible sample of size n has the same chance of being selected, then the sample is called a simple random sample. One method of selecting a simple random sample uses random number generators or random number tables. The procedure is the following:
1. Get a list of all subjects in the population.
2. Obtain random numbers from a random number generator or a table.
3. Select the subjects whose numbers in the list match the obtained random numbers.

Note: The above method is ideal, but it is impractical for some data; in particular, it is difficult to implement when the sample must be drawn from a very large population.

Reading Assignment: Chapter 1 (1.1, 1.2, 1.4) in W.W. Daniel.

Chapter 2 Descriptive Statistics

Introduction

In this chapter we learn several techniques for organizing and presenting data so that we may easily determine what information they contain.

The Ordered Array

An ordered array is a listing of the values of a collection of data in order of magnitude from the smallest value to the largest value. An ordered array enables one to determine quickly the value of the smallest measurement and the value of the largest measurement.
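As a small illustration of the simple random sample procedure and of the ordered array, the following Python sketch draws a sample and then sorts it. The population list of 200 numbered subjects, the sample size, the seed, and the function names are all assumptions made for this illustration; they do not come from the handout.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct subjects; every subset of size n is equally likely."""
    rng = random.Random(seed)              # seeded only to make the run repeatable
    return rng.sample(list(population), n)


def ordered_array(values):
    """List the values from the smallest to the largest."""
    return sorted(values)


# Hypothetical population: subjects numbered 1..200 (step 1 of the procedure).
subject_ids = range(1, 201)

# Steps 2 and 3: random numbers pick the subjects.
sample_ids = simple_random_sample(subject_ids, 10, seed=1)
ordered_ids = ordered_array(sample_ids)

print("sample:       ", sample_ids)
print("ordered array:", ordered_ids)
print("smallest:", ordered_ids[0], "largest:", ordered_ids[-1])
```

Here `random.sample` plays the role of the random number table: it selects without replacement, so no subject can appear twice in the sample.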

Example: The following data are the ages, rounded to the nearest year, of 30 people who were discharged from a general hospital last Friday:

51 70 79 75 55 25 38 74 54 72
37 15 56 17 77 43 16 15 72 92
25 30 24 46 47 46 38 81 49 45

To put the above data in an ordered array, we list the measurements from the smallest to the largest:

15 15 16 17 24 25 25 30 37 38
38 43 45 46 46 47 49 51 54 55
56 70 72 72 74 75 77 79 81 92

Frequency tables without classes:

Such tables can be used to organize all types of data.

Example: The following table shows the letter grades of 150 students.

X (letter grade)    Frequency (number of students)
F                   12
D                   15
D+                  20
C                   35
C+                  30
B                   18
B+                  12
A                   8

Grouped Data: Frequency tables with classes

To group a set of observations we select a set of contiguous, non-overlapping intervals such that each observation belongs to exactly one interval. These intervals are called class intervals. Class intervals need not have the same width. All class intervals are listed in a table which is referred to as a frequency table. A typical frequency table consists of the following:

Class intervals: a column in which all class intervals are listed.

Midpoints: a column in which the midpoints of the class intervals are computed. The midpoint of a class interval equals (left side + right side)/2.

Frequency: the frequency of a class interval is the number of observations that belong to the class interval.

Cumulative Frequency: the cumulative frequency of a class interval is the number of observations that are less than or equal to the right-hand side of that class interval.

Relative Frequency: the relative frequency of a class interval equals (the frequency of the class interval / total frequency).

Cumulative Relative Frequency: it equals (cumulative frequency / total frequency).

A natural question is how many class intervals should be included in a frequency table. A rule of thumb states that the number of class intervals k should be between 5 and 15. We may use the following rule given by Sturges as a guide for computing k: the number of class intervals is the closest integer k to

$1 + 3.322 \log_{10}(n)$,

where n is the total number of observations. The number of class intervals specified by the rule can be increased or decreased for more convenience or better presentation.

After having decided on the number of class intervals, we decide on the class widths. If all classes are to have the same width, we compute the class width using the formula

(largest value - smallest value) / k,

rounded up to the next value that has the same accuracy unit as the data.

Example: Put the data mentioned in the previous example in a frequency table.

We start with computing the number of classes. Here n = 30 and $1 + 3.322 \log_{10}(30) \approx 5.907$. Thus we should have 6 class intervals. To obtain the class width we compute (92 - 15)/6 = 12.833. Since the observations are integers, we round 12.833 up to 13. Thus the class width is 13.

Now we are ready to construct the first class interval. Its left-hand side is the smallest observation, namely 15, and its right-hand side is (left-hand side + the class width - one accuracy unit) = 15 + 13 - 1 = 27. The left-hand side of the second class is the right-hand side of the first class + one accuracy unit, that is, 28. In general, the right-hand side of each class is the left-hand side of the class + the class width - one accuracy unit. We construct the other class intervals similarly.
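The construction just described can be sketched in Python as follows. This is only an illustrative sketch: it assumes integer data (accuracy unit 1), the ages are the 30 discharge ages of the example read as two-digit values, and the function names and the small guard that widens the classes when the largest value would otherwise be left out are choices made here, not part of the handout.

```python
import math

# The 30 ages from the example (accuracy unit = 1 year).
ages = [51, 70, 79, 75, 55, 25, 38, 74, 54, 72,
        37, 15, 56, 17, 77, 43, 16, 15, 72, 92,
        25, 30, 24, 46, 47, 46, 38, 81, 49, 45]


def sturges_class_count(n):
    """Closest integer to 1 + 3.322 * log10(n) (Sturges' rule)."""
    return round(1 + 3.322 * math.log10(n))


def frequency_table(data, k):
    """Build k equal-width classes for integer data, starting at the minimum."""
    smallest, largest = min(data), max(data)
    width = math.ceil((largest - smallest) / k)   # round up to a whole unit
    if smallest + k * width - 1 < largest:        # guard: cover the largest value
        width += 1
    n = len(data)
    rows, cumulative = [], 0
    for i in range(k):
        left = smallest + i * width
        right = left + width - 1                  # width minus one accuracy unit
        frequency = sum(left <= x <= right for x in data)
        cumulative += frequency
        rows.append({
            "class": (left, right),
            "midpoint": (left + right) / 2,
            "frequency": frequency,
            "cumulative": cumulative,
            "relative": frequency / n,
            "cumulative_relative": cumulative / n,
        })
    return rows


k = sturges_class_count(len(ages))
table = frequency_table(ages, k)
for row in table:
    print(row)
```

Running the sketch prints one row per class interval, which can be compared line by line with the frequency table of the example.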

Class       Midpoint   Frequency   Cumulative   Relative    Cumulative relative
intervals                          frequency    frequency   frequency
15-27       21         7           7            0.233       0.233
28-40       34         4           11           0.133       0.367
41-53       47         7           18           0.233       0.600
54-66       60         3           21           0.100       0.700
67-79       73         7           28           0.233       0.933
80-92       86         2           30           0.067       1.000

Example: Consider the following cumulative frequency distribution.

Class    Cumulative frequency
10-15    6
16-21    13
22-27    38
28-33    42
34-39    50

a) What is the width (or length) of each class?
b) Find the relative frequency of the second class.
c) Find the proportion of observations that are greater than or equal to 16 and less than or equal to 33.

Solution:
a) The class width equals 16 - 10 = 6 (or you may say 15 - 10 + one accuracy unit = 5 + 1 = 6).
b) The frequency of the second class equals 13 - 6 = 7 and the total frequency is 50. Thus the relative frequency of the second class equals 7/50 = 0.14.
c) The observations that are greater than or equal to 16 and less than or equal to 33 are those in the second, third and fourth classes, and their frequencies are 13 - 6 = 7, 38 - 13 = 25 and 42 - 38 = 4, respectively. Thus their proportion is (7 + 25 + 4)/50 = 0.72.

The Histogram; The Frequency Polygon:

The histogram is a graphical representation of the frequency distribution (or the relative frequency distribution). It reveals the shape of the data, for example the presence or absence of symmetry. When we construct the histogram, the boundaries of the class intervals are represented on the horizontal axis, while the vertical axis has as its scale the frequency (or the relative frequency). Above each class interval on the horizontal axis we construct a rectangle whose height equals the frequency (or relative frequency) of that class interval. All rectangles must be contiguous.

The frequency (or relative frequency) polygon is another graphical representation of the frequency (or relative frequency) distribution. To draw a frequency polygon we place a dot above the midpoint of each class interval represented on the horizontal axis. We add two extra dots on the horizontal axis at the midpoints of two additional class intervals, one located to the left of the first class and the other to the right of the last class. The height of each dot equals the frequency of the relevant class interval, and the heights of the extra dots are zero. Connecting the dots with line segments produces a frequency polygon.

Example: Construct the frequency histogram and the frequency polygon of the following part of a frequency table.

Class intervals   Frequency   Midpoint   Actual limits
15-27             7           21         14.5 - 27.5
28-40             4           34         27.5 - 40.5
41-53             7           47         40.5 - 53.5
54-66             3           60         53.5 - 66.5
67-79             7           73         66.5 - 79.5
80-92             2           86         79.5 - 92.5

The following is the histogram.

[Figure: frequency histogram. Vertical axis: Frequency (0 to 8). Horizontal axis: Actual Limits (14.5, 27.5, 40.5, 53.5, 66.5, 79.5, 92.5).]

The following is the frequency polygon.

[Figure: frequency polygon. Vertical axis: Frequency (0 to 8). Horizontal axis: Midpoint (8, 21, 34, 47, 60, 73, 86, 99).]

Stem-and-Leaf Display (optional):

The stem-and-leaf display is similar to the histogram and has the same purpose. Its main advantage over the histogram is that it preserves the information contained in the individual data items. It is effective with relatively small data sets. To construct a stem-and-leaf plot we:
1. partition each datum into two parts: the leaf, which consists of the units digit, and the stem, which consists of the remaining digits of the datum;
2. write down the stems on the left-hand side of the page;
3. draw a line to the right of these stems;
4. on the other side of the line, write down the leaves of all data that have the stem on the left.

The stems of the data should form an ordered column with the smallest stem at the top and the largest at the bottom. All the stems within the range are included in the stem column, even if no data item has that stem. Decimals, when present in the original data, are omitted in the stem-and-leaf display. If all data items are fractions less than one, then we can magnify the data by multiplying each data item by a number (10, 100, 1000, etc.) before we display the data in a stem-and-leaf plot.

Example: Display the following data in a stem-and-leaf plot:
2, 3, 6, 7, 12, 15, 15, 15, 17, 20, 20, 21, 29, 29, 34, 51, 56, 60, 65, 69, 80, 89

Solution:

Stems | Leaves
0     | 2 3 6 7
1     | 2 5 5 5 7
2     | 0 0 1 9 9
3     | 4
4     |
5     | 1 6
6     | 0 5 9
7     |
8     | 0 9

Reading Assignment: Chapter 2 (2.1, 2.2, 2.3) in W.W. Daniel.

Descriptive Statistics: Measures of Central Tendency

A descriptive measure is a single number that is used to summarize the data. Descriptive measures may be computed from the data of a sample or the data of a population.

Definition:
1. A descriptive measure computed from the data of a sample is called a statistic.
2. A descriptive measure computed from the data of a population is called a parameter.

Arithmetic Mean: The arithmetic mean of a sample is denoted by $\bar{x}$ and that of a population is denoted by $\mu$. From now on we will just say the mean for the arithmetic mean.

1) For raw (unorganized) data:

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where $x_1, x_2, \ldots, x_n$ are the observations in the sample and n is their number (the sample size).

$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$, where $x_1, x_2, \ldots, x_N$ are the observations in the population and N is their number (the population size).

2) For frequency tables:

$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$, where $x_1, x_2, \ldots, x_k$ are the observations (or midpoints) and $f_1, f_2, \ldots, f_k$ are their corresponding frequencies.

Properties of the Mean:
1. Uniqueness: for a given set of data there is one and only one mean.
2. Simplicity: the mean of any sample is easy to compute.
3. The value of each data item has an influence on the mean. Thus the mean is affected by extreme values, which in some cases makes it a poor representative of the tendency of the majority of the data.

Example: The mean of the data 50, 49, 53, 48, 54, 420 equals (50 + 49 + 53 + 48 + 54 + 420)/6 = 112.333, a number which does not represent the tendency of the data. However, if we trim out the observation 420, then the mean becomes (50 + 49 + 53 + 48 + 54)/5 = 50.8. Notice the influence of the observation 420 on the value of the mean.

Example (part 1 is optional): Compute the mean for the following two data sets.

1)
Stem | Leaf
0    | 1, 2, 5
1    | 0, 1
2    | 1, 1, 1, 2
3    | 0, 1, 2, 2

2)
Class    Frequency
0-2      3
3-5      2
6-8      1
9-11     2
12-14    2

Solution:
1) mean = sum of observations / number of observations
= (1 + 2 + 5 + 10 + 11 + 21 + 21 + 21 + 22 + 30 + 31 + 32 + 32)/13 = 239/13 = 18.3846

2)
Midpoint (x)   Frequency (f)   f x
1              3               3
4              2               8
7              1               7
10             2               20
13             2               26
Total          10              64

mean = 64/10 = 6.4

The Median: The median of a finite set of observations is the value which divides the set into two equal parts, such that the number of values equal to or greater than the median is equal to the number of values equal to or less than the median. The median is the middle value (or the average of the two middle values) when all values have been arranged in order of magnitude.

Example: Find the median of the following observations:
45, 78, 23, 54, 32, 12, 90, 46, 68, 45, 11
The first step is to arrange the data in order of magnitude:
11, 12, 23, 32, 45, 45, 46, 54, 68, 78, 90
Notice that 45 is located exactly in the middle of all ordered values; thus the median is 45.

Example: Find the median of 78, 94, 5, 3, 56, 66, 38, 78, 3, 80.
We order the data as a first step:
3, 3, 5, 38, 56, 66, 78, 78, 80, 94
Notice that no single datum is located in the middle of the ordered data because the number of data items is even. However, the two values 56 and 66 are located in the middle; thus the median equals (56 + 66)/2 = 61.

Properties of the Median:
1. Uniqueness.
2. Simplicity.
3. Unlike the mean, it is not drastically affected by extreme values.
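The computations of this section can be checked with a short Python sketch. It uses the standard `statistics` module for raw data; `grouped_mean` is a helper name introduced here for the frequency-table formula, and the data are the extreme-value example (50, 49, 53, 48, 54, 420) and the midpoint table with total frequency 10.

```python
from statistics import mean, median


def grouped_mean(midpoints, frequencies):
    """Mean for a frequency table: sum(f * x) / sum(f)."""
    total = sum(frequencies)
    return sum(x * f for x, f in zip(midpoints, frequencies)) / total


with_extreme = [50, 49, 53, 48, 54, 420]
trimmed = with_extreme[:-1]            # the same data without the extreme value

# The mean moves a lot when the extreme value is dropped ...
print(mean(with_extreme), mean(trimmed))
# ... while the median barely changes (property 3 of the median).
print(median(with_extreme), median(trimmed))

# Frequency table of the second data set: midpoints and their frequencies.
print(grouped_mean([1, 4, 7, 10, 13], [3, 2, 1, 2, 2]))
```

Comparing the two printed lines for the mean and the median shows why the median is preferred when a data set contains extreme values.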

The Mode: A mode of a set of observations is an observation that has the largest frequency. If all observations have the same frequency, the data set has no mode. A data set may have more than one mode. The mode may be used to describe qualitative data. A mode of grouped data is estimated by the midpoint of a class with the highest frequency.

Example: The following list gives the nationalities of a sample of 10 patients who had psychotherapy last year in a private clinic:
British, French, American, American, Dutch, British, Spanish, South African, French, American
To find the mode of the above nationalities we make the following table.

Nationality      Frequency
American         3
British          2
Dutch            1
French           2
Spanish          1
South African    1

Notice that the most frequently occurring nationality is American; thus the mode is American.

Example: Find the mode of the data
28, 28, 28, 28, 28, 29, 30, 31, 32, 32, 32, 32, 32, 36, 39, 42, 44, 44, 45
There are two modes for the above data, namely 28 and 32, because they have the same highest frequency.

Reading Assignment: Chapter 2 (2.4) in W.W. Daniel.
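A minimal Python sketch of the mode, applied to the two examples above. The function name `modes` is introduced here, and the convention of returning an empty list when every observation is equally frequent is an assumption of this sketch.

```python
from collections import Counter


def modes(observations):
    """Return all observations that share the largest frequency."""
    counts = Counter(observations)
    top = max(counts.values())
    # Assumed convention: if several distinct values all occur equally often,
    # report no mode at all.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(value for value, c in counts.items() if c == top)


nationalities = ["British", "French", "American", "American", "Dutch",
                 "British", "Spanish", "South African", "French", "American"]
numbers = [28, 28, 28, 28, 28, 29, 30, 31, 32, 32, 32, 32, 32,
           36, 39, 42, 44, 44, 45]

print(modes(nationalities))
print(modes(numbers))
```

Because the function only counts occurrences, it works for qualitative data such as nationalities as well as for numbers.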

Descriptive Statistics: Measures of Dispersion

The dispersion of a set of data (or observations) refers to the variety that they exhibit. A measure of dispersion provides information about the amount of variability present in a set of data. When the dispersion is "small", the values of the data items are "close" together. The following graph represents two frequency polygons for population A and population B with the same mean. Notice that population B exhibits more dispersion because the values of its observations are more spread out.

Dispersion can be measured using one of the following measures.

The Range: The range of a set of values is given by R = largest value - smallest value. The range is simple to compute, but it is not usually used as a reliable measure of dispersion because it is drastically affected by extreme values.

The Variance:

1) For raw data: the variance of the sample $x_1, x_2, \ldots, x_n$ is given by

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$,

where $\bar{x}$ is the mean of the sample. One can easily show that the above formula for the variance also has the following form,

$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$,

which is easier for computations with calculators. The population variance is given by

$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2}{N} - \mu^2$,

where N is the population size and $\mu$ is the mean of the population.

2) For frequency tables:

$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{k} f_i x_i\right)^2\right]$, with $n = \sum_{i=1}^{k} f_i$,

where $x_1, x_2, \ldots, x_k$ are the observations (or the midpoints) and $f_1, f_2, \ldots, f_k$ are their corresponding frequencies.

The Standard Deviation: The variance represents squared units and, therefore, is not an appropriate measure of dispersion when we want to express it in terms of the original units. To obtain a measure of dispersion in the original units, we take the square root of the variance, which we refer to as the standard deviation. The standard deviations of a sample and of a population are denoted by $s$ and $\sigma$, respectively:

$s = \sqrt{s^2}$ and $\sigma = \sqrt{\sigma^2}$.

Example: Find the mean and standard deviation of each of the following samples.

i) 42, 28, 28, 61, 31, 23, 50, 34, 32, 37

ii)
Class       Frequency
-2 to 0     1
1 to 3      2
4 to 6      3
7 to 9      2
10 to 12    1

Solution:

i)
x       x^2
42      1764
28      784
28      784
61      3721
31      961
23      529
50      2500
34      1156
32      1024
37      1369
Total   366     14592

Thus, for sample i), $\bar{x} = \frac{366}{10} = 36.6$ and $s = \sqrt{\frac{14592 - 366^2/10}{9}} = \sqrt{132.9333} = 11.5297$.

ii)
Midpoint (x)   Frequency (f)   f x    x^2          f x^2
-1             1               -1     1            1
2              2               4      4            8
5              3               15     25           75
8              2               16     64           128
11             1               11     121          121
Total          9               45     not needed   333

Thus $\bar{x} = \frac{45}{9} = 5$ and $s = \sqrt{\frac{333 - 45^2/9}{8}} = \sqrt{13.5} = 3.6742$.

The Coefficient of Variation

The coefficient of variation, denoted by C.V., is a unit-free measure that is used to compare the amount of dispersion between two different sets of data with (possibly) different means and different units. The coefficient of variation is given by

$\text{C.V.} = \frac{s}{\bar{x}} \times 100\%$.

Example: The following table summarizes the data collected about the weights of two samples of human males.

                     Sample 1      Sample 2
Age                  25 years      11 years
Mean weight          145 pounds    80 pounds
Standard deviation   10 pounds     10 pounds

Which of the samples is more dispersed?

To compare dispersion we compute the C.V. for each sample.
C.V. for sample 1 = (10/145) × 100 = 6.9
C.V. for sample 2 = (10/80) × 100 = 12.5
Since the C.V. of sample 2 is greater than the C.V. of sample 1, sample 2 is more dispersed.
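The variance, standard deviation, and coefficient of variation examples can be reproduced with the Python sketch below. It implements the computational ("calculator") form of the sample variance for raw data and for frequency tables; the function names are introduced here for illustration, and `statistics.stdev` is used only as an independent cross-check on the raw data.

```python
import math
import statistics


def sample_variance(values, frequencies=None):
    """s^2 = [sum(f x^2) - (sum(f x))^2 / n] / (n - 1), with n = sum(f)."""
    if frequencies is None:                  # raw data: every frequency is 1
        frequencies = [1] * len(values)
    n = sum(frequencies)
    sum_x = sum(f * x for x, f in zip(values, frequencies))
    sum_x2 = sum(f * x * x for x, f in zip(values, frequencies))
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


def coefficient_of_variation(s, mean):
    """C.V. = (s / mean) * 100, a unit-free percentage."""
    return s / mean * 100


raw = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
s_raw = math.sqrt(sample_variance(raw))

midpoints = [-1, 2, 5, 8, 11]
frequencies = [1, 2, 3, 2, 1]
s_grouped = math.sqrt(sample_variance(midpoints, frequencies))

print(round(s_raw, 4), round(statistics.stdev(raw), 4))
print(round(s_grouped, 4))
print(round(coefficient_of_variation(10, 145), 1),
      round(coefficient_of_variation(10, 80), 1))
```

The last line compares the two weight samples: both have the same standard deviation, but the sample with the smaller mean has the larger coefficient of variation.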

Percentiles and Quartiles

Percentiles and quartiles are used to indicate certain positions (or locations) of the observations (or data). The pth percentile is denoted by $P_p$; it is the number P such that (almost) p% of the observations are less than or equal to P. The 25th percentile is also denoted by $Q_1$ and is also called the 1st quartile. The second quartile $Q_2$ is the 50th percentile (the median), while the 3rd quartile $Q_3$ is the 75th percentile.

Computing percentiles:

1) For ungrouped data, the pth percentile is thought of as the $\frac{p}{100}(n+1)$th ordered observation. Thus
$Q_1$ is the 0.25 (n + 1)th ordered observation,
$Q_2$ is the 0.5 (n + 1)th ordered observation,
$Q_3$ is the 0.75 (n + 1)th ordered observation.

The pth percentile for ungrouped data is computed using the formula

$P_p = x_{(k)} + (r - k)\,(x_{(k+1)} - x_{(k)})$, where $r = \frac{p}{100}(n + 1)$,

$k$ is the floor of $r$, $x_{(k)}$ is the kth ordered observation, and n is the number of observations (or total frequency). Before you apply the formula, make sure that the observations are written in ascending order.

2) For grouped data, think of the pth percentile as the observation that has cumulative frequency $\frac{p}{100} n$, where n is the total frequency. Find the first class that has cumulative frequency greater than or equal to $\frac{p}{100} n$. Use the actual limits and the cumulative frequencies of this class to approximate the required percentile linearly, as shown in the example.

Interquartile Range: The interquartile range is denoted by IQR. It is given by $\text{IQR} = Q_3 - Q_1$.

Example: Find $Q_1$, the median, $Q_3$, $P_{60}$ and IQR for the following observations:
32, 12, 54, 43, 51, 17, 23, 19, 14, 22, 25, 28, 33, 42, 26, 38, 50
We start by putting the above data in ascending order:
12, 14, 17, 19, 22, 23, 25, 26, 28, 32, 33, 38, 42, 43, 50, 51, 54
The number of observations is n = 17.

The first quartile Q1 is the 0.25 × (17 + 1)th ordered observation, i.e., Q1 is the 4.5th observation. The 4th observation is 19 and the 5th observation is 22, hence the 4.5th observation is 19 + 0.5 × (22 - 19) = 20.5.
The median is the 0.5 × (17 + 1)th ordered observation, i.e., it is the 9th observation, namely 28.
Q3 is the 0.75 × (17 + 1)th observation, i.e., it is the 13.5th observation, namely 42 + 0.5 × (43 - 42) = 42.5.
P60 is the 0.6 × (17 + 1)th observation, i.e., it is the 10.8th observation, namely 32 + 0.8 × (33 - 32) = 32.8.
IQR = Q3 - Q1 = 42.5 - 20.5 = 22.

Example: Find the median and the 80th percentile of the following data.

x       Frequency
3       2
5       4
9       3
12      5
17      4
Total   18

The total frequency is n = 18.
To find the median: 0.5 × 19 = 9.5. Thus the median is the 9.5th ordered observation, which is 9 + 0.5 × (12 - 9) = 10.5.
To find the 80th percentile: 0.8 × 19 = 15.2. Thus the 80th percentile is the 15.2th ordered observation, which is 17 + 0.2 × (17 - 17) = 17.
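The (n + 1) rule for ungrouped data used in the two examples above can be sketched in Python as follows. The names `percentile`, `expand`, `r` and `k` are introduced here for illustration, and returning the smallest or largest observation when the position falls outside the data is an assumption of this sketch, since the handout does not cover that case.

```python
import math


def percentile(data, p):
    """pth percentile as the (p/100)(n+1)th ordered observation, interpolated."""
    xs = sorted(data)                   # the rule needs ascending order
    n = len(xs)
    r = p * (n + 1) / 100               # position, possibly fractional
    k = math.floor(r)
    if k < 1:                           # assumed handling of positions before the data
        return xs[0]
    if k >= n:                          # assumed handling of positions past the data
        return xs[-1]
    # xs[k - 1] is the kth ordered observation (Python lists start at 0).
    return xs[k - 1] + (r - k) * (xs[k] - xs[k - 1])


def expand(values, frequencies):
    """Turn a frequency table without classes back into a list of observations."""
    return [v for v, f in zip(values, frequencies) for _ in range(f)]


observations = [32, 12, 54, 43, 51, 17, 23, 19, 14, 22, 25, 28, 33, 42, 26, 38, 50]
q1, med, q3 = (percentile(observations, p) for p in (25, 50, 75))
p60 = percentile(observations, 60)
iqr = q3 - q1
print(q1, med, q3, round(p60, 1), iqr)

table_data = expand([3, 5, 9, 12, 17], [2, 4, 3, 5, 4])
print(percentile(table_data, 50), percentile(table_data, 80))
```

Expanding the frequency table into its 18 individual observations lets the same function handle both examples.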

Example: Find the median for the following grouped data.

Class    Frequency
2-6      3
7-11     5
12-16    7
17-21    2
Total    17

Here n = 17 and (50/100) × 17 = 8.5. The first class that has cumulative frequency greater than or equal to 8.5 is 12-16 (the cumulative frequency is 8 at the end of the class 7-11 and 15 at the end of the class 12-16), and the actual limits of this class are 11.5 and 16.5, respectively. Interpolating linearly within this class,

median = 11.5 + ((8.5 - 8)/7) × (16.5 - 11.5) = 11.5 + (0.5/7) × 5 = 11.8571.

Example: Consider the following table of grouped data.

Class       10-15   16-21   22-27   28-33   34-39   Total
Frequency   3 5 4                                   16

Estimate the proportion of observations that are less than 24.

The observation 24 belongs to the class 22-27. Let p be the required proportion.

Box-and-Whisker Plots (Box plots) (Optional):

A box-and-whisker plot (or simply a box plot) is a useful visual device for presenting the information contained in a data set. It reveals information regarding the amount of spread, the location of concentration, and the symmetry of the data. The construction of such a plot makes use of the quartiles of a data set and may be accomplished by the following steps:
1. Represent the data on the horizontal axis.
2. Draw a box in the space above the horizontal axis in such a way that the left end of the box aligns with the first quartile Q1 and the right end of the box aligns with the third quartile Q3.
3. Divide the box into two parts by a vertical line that aligns with the median Q2.
4. Draw a horizontal line, called a whisker, from the left end of the box to a point that aligns with the smallest measurement in the data set.
5. Draw another horizontal line, or whisker, from the right end of the box to a point that aligns with the largest measurement in the data set.

Example: Construct a box-and-whisker plot for the data in the previous example.
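The five numbers that the construction steps above rely on (smallest value, Q1, median, Q3, largest value) can be computed with the standard `statistics.quantiles` function, whose default "exclusive" method uses the same (n + 1) positions as this chapter. The data below are the 17 observations of the quartile example; choosing that data set, and the name `five_number_summary`, are assumptions made for this sketch.

```python
import statistics


def five_number_summary(data):
    """Values needed to draw a box plot: min, Q1, median, Q3, max."""
    q1, q2, q3 = statistics.quantiles(data, n=4)   # default method="exclusive"
    return {"min": min(data), "Q1": q1, "median": q2, "Q3": q3, "max": max(data)}


observations = [32, 12, 54, 43, 51, 17, 23, 19, 14, 22, 25, 28, 33, 42, 26, 38, 50]
summary = five_number_summary(observations)
print(summary)
```

The box then spans Q1 to Q3 with a dividing line at the median, and the whiskers run out to the minimum and the maximum.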

Reading Assignment: Chapter 2 (2.5) in W.W. Daniel.

Chapter 3 Some Basic Probability Concepts

Elementary Properties of Probability:

A random experiment is an experiment whose outcome is a random variable, i.e., cannot be predicted with certainty. The sample space of a random experiment is the collection of all possible values of its outcome. An event is a subcollection of the sample space. The empty event is denoted by $\emptyset$; it is the event of having no outcomes. The probability of an event E is denoted by P(E). It is a nonnegative number, less than or equal to 1, that measures the likelihood of the occurrence of the event E.

Example: The following is the sample space of the experiment of tossing a coin:
$S = \{H, T\}$,
where H stands for head and T stands for tail. The following is the collection of all possible events of the experiment of tossing a coin:
$\emptyset, \{H\}, \{T\}, \{H, T\}$

Example: Find the sample space and five different events of the experiment of tossing a coin twice.

Solution:
$S = \{(H, H), (H, T), (T, H), (T, T)\}$

The following are events of the experiment:
$E_1 = \{(H, T), (H, H)\}$,
$E_2 = \{(H, T), (H, H), (T, T)\}$,
$E_3 = \{(H, H)\}$,
$E_4 = S = \{(H, H), (H, T), (T, H), (T, T)\}$,
$E_5 = \emptyset$.

Definition: If every possible value of the outcome of a random experiment has the same chance to occur, then the experiment is said to be equally likely. If an experiment is equally likely and has a finite sample space S, then the probability of an event E of this experiment is given by

$P(E) = \frac{|E|}{|S|}$,

where $|\cdot|$ stands for the number of elements and S is the sample space of the experiment.

Example: Find the probability of having a total number of dots greater than 4 if a pair of fair dice is rolled.

The sample space of the experiment of rolling a pair of dice is
$S = \{(1,1), (1,2), (1,3), \ldots, (1,6), (2,1), (2,2), \ldots, (2,6), \ldots, (6,6)\}$.
The mentioned event is the following:
$E = \{(1,4), (1,5), (1,6), (2,3), (2,4), (2,5), (2,6), (3,2), \ldots, (3,6), (4,1), \ldots, (4,6), (5,1), \ldots, (5,6), (6,1), \ldots, (6,6)\}$.
Notice that $|S| = 36$ and $|E| = 30$. Thus $P(E) = \frac{30}{36} = \frac{5}{6}$.

Conditional Probability: If A and B are events, then by $P(B \mid A)$ we denote the probability of occurrence of the event B given that the event A has occurred. It is called a conditional probability and it is read "probability of B given A".

Elementary Properties:
1. For any event E, $0 \le P(E) \le 1$.
2. $P(\emptyset) = 0$ and $P(S) = 1$.
3. If $S = \{s_1, s_2, \ldots, s_n\}$ then $P(\{s_1\}) + P(\{s_2\}) + \cdots + P(\{s_n\}) = 1$.
4. If $A \subseteq B$ then $P(A) \le P(B)$.
5. $P(\text{not } A) = P(A^c) = 1 - P(A)$, where $A^c$ denotes the complement of A.
6. $P(A \text{ or } B) = P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
7. $P(\text{not } A \text{ and not } B) = P((A \cup B)^c) = 1 - P(A \cup B)$.
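For an equally likely experiment with a finite sample space, the formula P(E) = |E| / |S| can be checked by listing the sample space. The Python sketch below does this for the pair-of-dice example; the function name `classical_probability` and the use of exact fractions are choices made for this illustration.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event, sample_space):
    """P(E) = |E| / |S| for an equally likely experiment with finite S."""
    outcomes = list(sample_space)
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


# Sample space of rolling a pair of dice: all ordered pairs (i, j), 1 <= i, j <= 6.
dice = list(product(range(1, 7), repeat=2))

p_total_gt_4 = classical_probability(lambda o: sum(o) > 4, dice)
p_total_le_4 = classical_probability(lambda o: sum(o) <= 4, dice)

print(len(dice), p_total_gt_4)
print(1 - p_total_le_4)      # complement rule gives the same value
```

The second printed line uses the complement of the event (a total of at most 4), which is often the shorter list to count by hand.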

Example: The following table represents the frequency of cocaine use by gender among 111 adult cocaine users (in the US).

Lifetime frequency of cocaine use   Male (M)   Female (F)   Total
1-19 times (A)                      32         7            39
20-99 times (B)                     18         20           38
100+ times (C)                      25         9            34
Total                               75         36           111

1. What is the probability that a randomly selected user will be a male?
2. If we pick a person at random from the group of 111 and find out that he is a male (M), what is the probability that he used cocaine 100+ times (C)?
3. What is the probability that a randomly selected person from the group of 111 is a male (M) and a person who used cocaine 100+ times (C)?
4. What is the probability that a randomly selected person from the group of 111 is a female (F) or a person who used cocaine 20-99 times (B)?
5. What is the probability that a randomly selected person from the group of 111 is not a person who used cocaine 100+ times (C)?

Solution:
1. $P(M) = \frac{|M|}{111} = \frac{75}{111}$.
2. We use the notation $P(C \mid M)$ to denote the probability of the event C given that the event M has occurred. It is read "probability of C given M". Knowing that the selected person is a male reduces our sample space to the group of males only, thus
$P(C \mid M) = \frac{|C \text{ for males}|}{|M|} = \frac{25}{75}$.
3. $P(M \text{ and } C) = \frac{|M \text{ and } C|}{111} = \frac{25}{111}$.
4. $P(F \text{ or } B) = P(F) + P(B) - P(F \text{ and } B) = \frac{36}{111} + \frac{38}{111} - \frac{20}{111} = \frac{54}{111}$.
5. $P(\text{not } C) = 1 - P(C) = 1 - \frac{34}{111} = \frac{77}{111}$.

Example: In a group of people, 25% have both diabetes and hypertension, 42% have hypertension, and 35% have diabetes. A person is selected at random from this group. What is the probability that this person
a. is diabetic or hypertensive?

b. does not have hypertension?
c. is not diabetic and does not have hypertension?

Solution: Let D be the event "diabetic" and H the event "hypertensive".
a. $P(D \cup H) = P(D) + P(H) - P(D \cap H) = 0.35 + 0.42 - 0.25 = 0.52$
b. $P(H^c) = 1 - P(H) = 1 - 0.42 = 0.58$
c. $P(D^c \cap H^c) = P((D \cup H)^c) = 1 - P(D \cup H) = 1 - 0.52 = 0.48$

Calculating the Probability of an Event; Conditional Probability:

Recall that by $P(B \mid A)$ we denote the probability of occurrence of the event B given that the event A has occurred. The conditional probability $P(B \mid A)$ can be computed using the formula

$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} = \frac{P(A \cap B)}{P(A)}$.

Thus $P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B \mid A)$.

Example: Let A, B be two events such that P(A) = 0.4, P(B) = 0.8 and $P(A \cap B) = 0.3$. Find $P(A^c \mid B)$, where $A^c$ denotes the complement of A.

$P(A^c \mid B) = \frac{P(A^c \cap B)}{P(B)}$. Use the following table to find the value of each of these quantities.

                    A     A^c   Total probability
B                   0.3   0.5   0.8
B^c                 0.1   0.1   0.2
Total probability   0.4   0.6   1

Thus $P(A^c \mid B) = \frac{P(A^c \cap B)}{P(B)} = \frac{0.5}{0.8} = 0.625$.

Definition: The events A and B are independent if $P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B)$. Equivalently, if P(A) > 0 and P(B) > 0, then the events A and B are independent if $P(B \mid A) = P(B)$ (and $P(A \mid B) = P(A)$).

Example: In a group of people, 25% have both diabetes and hypertension, 42% have hypertension, and 35% have diabetes.
a. What percentage of the people who have hypertension also have diabetes?
b. For that group of people, are the events "Diabetic" and "Hypertensive" independent?

a. $P(\text{diabetic} \mid \text{hypertensive}) = \frac{P(\text{diabetic and hypertensive})}{P(\text{hypertensive})} = \frac{0.25}{0.42} = 0.595$.
Thus the percentage of the people with hypertension who also have diabetes is 59.5%.
b. The events "Diabetic" and "Hypertensive" are not independent because
$P(\text{diabetic} \mid \text{hypertensive}) = 0.595 \ne P(\text{diabetic}) = 0.35$.

Fact: If A and B are independent, then the following events are also independent: A and $B^c$, $A^c$ and B, $A^c$ and $B^c$ (here $A^c$ denotes the complement of A).

Example: Let A, B be two independent events such that P(A) = 0.4 and P(B) = 0.2. Find
i) $P(A^c \cap B)$
ii) $P(B^c \mid A)$.

Since A and B are independent, $A^c$ and B are independent, and A and $B^c$ are independent. Thus:
i) $P(A^c \cap B) = P(A^c)\,P(B) = (1 - 0.4)(0.2) = 0.12$
ii) $P(B^c \mid A) = P(B^c) = 1 - 0.2 = 0.8$

Definition: The events A and B are mutually exclusive if $P(A \cup B) = P(A) + P(B)$. Equivalently, the events A and B are mutually exclusive if $P(A \cap B) = 0$.

Example: If a person (in the above example) is selected at random,
a. what is the probability that this person is diabetic or hypertensive?
b. are the events "Diabetic" and "Hypertensive" mutually exclusive?

a. P(diabetic or hypertensive) = P(diabetic) + P(hypertensive) - P(diabetic and hypertensive) = 0.35 + 0.42 - 0.25 = 0.52
b. The events "Diabetic" and "Hypertensive" are not mutually exclusive because P(diabetic and hypertensive) = 0.25, which is not 0.

Example: If a person (in the above example) is selected at random, what is the probability that this person:
a. does not have hypertension?
b. is not diabetic and does not have hypertension?

a. P(does not have hypertension) = 1 - P(has hypertension) = 1 - 0.42 = 0.58

b. P(not diabetic and not hypertensive) = P(not (diabetic or hypertensive)) = 1 - P(diabetic or hypertensive) = 1 - 0.52 = 0.48

Bayes's Theorem, Screening Tests, Sensitivity: HANDOUT IS NOT AVAILABLE. READ FROM YOUR MAIN REFERENCE.

Reading Assignment: Chapter 3 (3.1, 3.2, 3.3, 3.4, 3.5) in W.W. Daniel.
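The table-based probability calculations of this chapter can be sketched in Python. The counts below are the cells of the cocaine-use table (rows A, B, C by gender); the helper names `prob`, `conditional`, and `independent` are introduced here for illustration, and exact fractions are used so that the independence check is not affected by rounding.

```python
from fractions import Fraction

# Cell counts of the two-way table: (gender, frequency group) -> number of users.
counts = {
    ("M", "A"): 32, ("F", "A"): 7,
    ("M", "B"): 18, ("F", "B"): 20,
    ("M", "C"): 25, ("F", "C"): 9,
}
total = sum(counts.values())


def prob(event):
    """Probability that a randomly selected person falls in the event."""
    favourable = sum(c for cell, c in counts.items() if event(cell))
    return Fraction(favourable, total)


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    joint = prob(lambda cell: event(cell) and given(cell))
    return joint / prob(given)


def independent(event_a, event_b):
    """Check the definition P(A and B) = P(A) P(B)."""
    joint = prob(lambda cell: event_a(cell) and event_b(cell))
    return joint == prob(event_a) * prob(event_b)


def male(cell):
    return cell[0] == "M"


def female(cell):
    return cell[0] == "F"


def group_b(cell):
    return cell[1] == "B"


def group_c(cell):
    return cell[1] == "C"


p_male = prob(male)
p_c_given_male = conditional(group_c, male)
p_male_and_c = prob(lambda cell: male(cell) and group_c(cell))
p_female_or_b = prob(lambda cell: female(cell) or group_b(cell))
p_not_c = 1 - prob(group_c)

print(p_male, p_c_given_male, p_male_and_c, p_female_or_b, p_not_c)
print(independent(male, group_c))
```

Note that `Fraction` prints values in lowest terms, so 75/111 appears as 25/37; the two are equal.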

Chapter 4 Probability Distributions (University of Jordan, Fall 2008/2009, Department of Mathematics)

The Distribution of a Discrete Random Variable:

The distribution of a discrete random variable X is a table, a graph or a formula that is used to specify all possible values of X along with the probability of each one of these possible values.

Example: Consider the following distribution of a discrete random variable X. Find:
1) P(X is odd)
2) P(X is even | X > 0)

k       P(X = k)
0       0.2
1       0.3
2       0.1
3       0.4
Total   1

1) P(X is odd) = P(X = 1 or X = 3) = P(X = 1) + P(X = 3) = 0.3 + 0.4 = 0.7
2) P(X is even | X > 0) = P(X is even and X > 0) / P(X > 0) = P(X = 2) / (1 - P(X = 0)) = 0.1 / 0.8 = 0.125

The Expected Value (Mean) and Variance of a Discrete Random Variable:

The expected value (or the mean) of a discrete random variable X is denoted by E(X) (or $\mu$) and is given by

$\mu = E(X) = \sum_k k\,P(X = k)$,

where the sum runs over all possible values of the random variable. The variance of X is given by

$\sigma^2 = E(X^2) - \mu^2$, where $E(X^2) = \sum_k k^2\,P(X = k)$.

Example: Find $\mu$ and $\sigma^2$ for the random variable given in the above example.

k       P(X = k)   k P(X = k)   k^2 P(X = k)
0       0.2        0            0
1       0.3        0.3          0.3
2       0.1        0.2          0.4
3       0.4        1.2          3.6
Total   1          1.7          4.3

Thus $\mu = 1.7$ and $\sigma^2 = 4.3 - 1.7^2 = 1.41$.
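The calculations for a discrete distribution given as a table can be sketched in Python as follows. The distribution is stored as a dictionary from each value k to P(X = k); the function names are introduced here for illustration.

```python
# Distribution of X from the example: value k -> P(X = k).
pmf = {0: 0.2, 1: 0.3, 2: 0.1, 3: 0.4}


def probability(pmf, event):
    """P(X satisfies the event): add the probabilities of the matching values."""
    return sum(p for k, p in pmf.items() if event(k))


def conditional_probability(pmf, event, given):
    """P(event | given) = P(event and given) / P(given)."""
    joint = probability(pmf, lambda k: event(k) and given(k))
    return joint / probability(pmf, given)


def expected_value(pmf):
    """E(X) = sum of k * P(X = k)."""
    return sum(k * p for k, p in pmf.items())


def variance(pmf):
    """Var(X) = E(X^2) - (E(X))^2."""
    mu = expected_value(pmf)
    return sum(k * k * p for k, p in pmf.items()) - mu ** 2


p_odd = probability(pmf, lambda k: k % 2 == 1)
p_even_given_positive = conditional_probability(
    pmf, lambda k: k % 2 == 0, lambda k: k > 0)

print(round(p_odd, 3), round(p_even_given_positive, 3))
print(round(expected_value(pmf), 2), round(variance(pmf), 2))
```

The results are rounded before printing because the probabilities are stored as floating-point numbers.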

331331 Lecture #9

The Binomial Experiment and Distribution: Before we introduce binomial (or Bernoulli) experiments we introduce notation for some relevant mathematical quantities.

1. The Factorial of a Nonnegative Integer: If $n$ is a nonnegative integer then $n!$, read "n factorial", is defined by
   $n! = 1$ if $n = 0$,
   $n! = n(n-1)(n-2)\cdots 1$ if $n > 0$.
Remark: For any $n \geq 1$, $n! = n\,(n-1)!$
Example: $0! = 1$, $1! = 1$, $2! = 2$, $3! = 3 \cdot 2 \cdot 1 = 6$, $4! = 4 \cdot 3! = 24$, ...

2. Combinations: If $n$ is a positive integer and $k$ is an integer such that $0 \leq k \leq n$ then the combination $\binom{n}{k}$ is defined by $\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$
Example:
   $\binom{10}{10} = \dfrac{10!}{10!\,0!} = 1$
   $\binom{10}{0} = \dfrac{10!}{0!\,10!} = 1$
   $\binom{10}{1} = \dfrac{10!}{1!\,9!} = \dfrac{10 \cdot 9!}{1 \cdot 9!} = 10$
   $\binom{10}{4} = \dfrac{10!}{4!\,6!} = \dfrac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6!}{4 \cdot 3 \cdot 2 \cdot 1 \cdot 6!} = 10 \cdot 3 \cdot 7 = 210$

Fact: The number of ways of selecting $k$ objects from $n$ objects is given by $\binom{n}{k}$.

Example: How many teams of 6 players can we choose out of a group of 8 people?
Answer: $\binom{8}{6} = \dfrac{8!}{6!\,2!} = \dfrac{8 \cdot 7 \cdot 6!}{6! \cdot 2} = 28$ teams.

Example: In how many ways can we choose 3 balls from an urn that contains 5 balls?
Answer: $\binom{5}{3} = \dfrac{5!}{3!\,2!} = 10$ ways.

Example: How many events of size 4 are there if the size of the sample space is 6?
Answer: $\binom{6}{4} = \dfrac{6!}{4!\,2!} = \dfrac{6 \cdot 5 \cdot 4!}{4! \cdot 2} = 15$ events.
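The factorial definition of a combination translates directly into code. The sketch below assumes Python 3.8 or later (for `math.comb`); the function name `combination` is a hypothetical helper used only for illustration.

```python
import math

def combination(n, k):
    """Number of ways to select k objects from n, via n! / (k! (n - k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# The counting examples from this section
teams = combination(8, 6)        # teams of 6 players out of 8 people
ball_choices = combination(5, 3) # 3 balls out of 5
events = combination(6, 4)       # events of size 4 in a sample space of size 6

print(combination(10, 4), teams, ball_choices, events)

# The standard library offers the same quantity directly
assert combination(10, 4) == math.comb(10, 4)
```

Integer division `//` is safe here because the quotient of the factorials is always a whole number.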

The binomial (or Bernoulli) experiment: A binomial (or Bernoulli) experiment is a random experiment that has the following properties:
1) each trial has exactly one of two possible outcomes; one is referred to as success and the other is referred to as failure.
2) the probability of success in each trial of the experiment is constant, usually denoted by $p$.
3) all trials of the experiment are independent.

Examples:
1. Tossing a coin. The outcome is either a head or a tail.
2. Checking whether a newborn is a boy or a girl.
3. Checking whether a person is diabetic or not.

The Binomial Random Variable: The binomial random variable $X$ is the number of successes when a binomial experiment, with probability of success $p$ in each trial, is performed $n$ times. We denote it by $X \sim \text{Bin}(n, p)$. The possible values of $X$ are $0, 1, 2, \ldots, n$.

Examples:
1. Select a random sample of 10 people. Let $X$ be the number of diabetics within this sample. Then $X \sim \text{Bin}(10, p)$, where $p$ is the proportion of diabetics in the population from which the sample is selected. The possible values of $X$ are $0, 1, 2, \ldots, 10$.
2. Toss a fair coin 20 times. Let $X$ be the number of times a head comes out. Then $X \sim \text{Bin}(20, 0.5)$. The possible values of $X$ are $0, 1, 2, \ldots, 20$.

Fact: If $X \sim \text{Bin}(n, p)$ then
1) $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for each $k = 0, 1, \ldots, n$
2) $E(X) = np$
3) $Var(X) = np(1-p)$

Example: Let $X \sim \text{Bin}(5, 0.3)$. Find: 1) $P(X = 2)$ 2) $E(X)$ 3) $E(X^2)$
1) $P(X = 2) = \binom{5}{2} (0.3)^2 (0.7)^3 = \dfrac{5!}{2!\,3!}\,(0.09)(0.343) = 10 \times 0.09 \times 0.343 = 0.3087$
2) $E(X) = 5 \times 0.3 = 1.5$
3) $Var(X) = E(X^2) - (E(X))^2$. Thus $E(X^2) = Var(X) + (E(X))^2 = 5 \times 0.3 \times 0.7 + 1.5^2 = 3.3$
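The binomial probability formula and the shortcuts $E(X) = np$ and $Var(X) = np(1-p)$ can be verified numerically. The following is a minimal sketch for $n = 5$ and $p = 0.3$; `binomial_pmf` is an illustrative helper name, and the moments are computed by brute-force summation rather than from the shortcut formulas.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.3

prob_two = binomial_pmf(2, n, p)

# Moments computed directly from the distribution
mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
second_moment = sum(k**2 * binomial_pmf(k, n, p) for k in range(n + 1))
variance = second_moment - mean**2

print(round(prob_two, 4), round(mean, 2), round(variance, 2), round(second_moment, 2))
```

The summed mean and variance agree with $np$ and $np(1-p)$, which is why the shortcut formulas can replace the summation.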

Example: Let $X \sim \text{Bin}(2, 0.5)$. Exhibit the distribution of $X$ as a table.

    k        P(X = k)
    0        0.25
    1        0.5
    2        0.25
    Total    1

Example: Suppose that the probability that a patient suffering from migraine headache pain will obtain relief with a particular drug is 0.9. Three randomly selected sufferers from migraine headache are given this drug. Find the probability that the number of sufferers in the selected sample obtaining relief will be:
1) Exactly zero
2) At least one
3) Two or three
4) At most two

Let $X$ be the number of sufferers in the selected sample obtaining relief. Then $X \sim \text{Bin}(3, 0.9)$.
1) $P(X = 0) = \binom{3}{0} (0.9)^0 (0.1)^3 = 0.001$
2) $P(X \geq 1) = 1 - P(X = 0) = 1 - 0.001 = 0.999$
3) $P(X = 2 \text{ or } X = 3) = P(X = 2) + P(X = 3) = 0.243 + 0.729 = 0.972$
4) $P(X \leq 2) = 1 - P(X = 3) = 1 - 0.729 = 0.271$

Note: The binomial distribution is completely determined by $n$ and $p$. They are called the parameters of the binomial distribution.

Binomial Tables: When $n$ is large, calculating binomial probabilities from the formula for $P(X = k)$ can be tedious. We may bypass these calculations by using a binomial table. Binomial tables enable us to read the value of $P(X \leq k)$ for any $k = 0, 1, 2, \ldots, n$.

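The following is a part of the binomial table for $n = 10$.

[Binomial table for n = 10 not reproduced here.]

Example: Let $X \sim \text{Bin}(10, 0.3)$. Use the above table to find:
1) $P(X \leq 4)$ 2) $P(X < 4)$ 3) $P(X = 4)$ 4) $P(X > 4)$ 5) $P(X \geq 4)$ 6) $P(2 < X < 6)$ 7) to 9) the variants of 6) in which one or both of the strict inequalities are replaced by $\leq$.

1) $P(X \leq 4) = 0.850$
2) $P(X < 4) = P(X \leq 3) = 0.650$
3) $P(X = 4) = P(X \leq 4) - P(X \leq 3) = 0.850 - 0.650 = 0.200$
4) $P(X > 4) = 1 - P(X \leq 4) = 1 - 0.850 = 0.150$
5) $P(X \geq 4) = 1 - P(X < 4) = 1 - P(X \leq 3) = 1 - 0.650 = 0.350$
6) $P(2 < X < 6) = P(3 \leq X \leq 5) = P(X \leq 5) - P(X \leq 2) = 0.953 - 0.383 = 0.570$
The rest are left as an exercise.

Reading Assignment: Chapter 4 (4.1, 4.2, 4.3) in W.W. Daniel, 7th edition.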
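A cumulative binomial table of the kind used here can be generated rather than looked up. The sketch below rebuilds the column for $n = 10$, $p = 0.3$ and repeats the table manipulations from the example; `binomial_cdf` and `table` are illustrative names, and the three-decimal rounding is an assumption about how the printed table was prepared.

```python
import math

def binomial_cdf(k, n, p):
    """Cumulative probability P(X <= k) for X ~ Bin(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

n, p = 10, 0.3

# One column of a cumulative binomial table, rounded to three decimals
table = {k: round(binomial_cdf(k, n, p), 3) for k in range(n + 1)}

p_equal_4 = table[4] - table[3]        # P(X = 4)  = P(X <= 4) - P(X <= 3)
p_more_than_4 = 1 - table[4]           # P(X > 4)  = 1 - P(X <= 4)
p_at_least_4 = 1 - table[3]            # P(X >= 4) = 1 - P(X <= 3)
p_between = table[5] - table[2]        # P(2 < X < 6) = P(X <= 5) - P(X <= 2)

for k, cumulative in table.items():
    print(k, cumulative)
```

Every question about $X$ reduces to sums and differences of cumulative values, which is why a single cumulative column is enough.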


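The Poisson Random Variable: The Poisson random variable is the number of occurrences of a rare event in an interval of time or a unit of space. If $\lambda$ is the average (or expected) number of occurrences of this event in the time (or space) unit then we write $X \sim \text{Poisson}(\lambda)$. The possible values of $X$ are $0, 1, 2, \ldots$

Fact: If $X \sim \text{Poisson}(\lambda)$ then
1) $P(X = k) = \dfrac{e^{-\lambda} \lambda^k}{k!}$ for each $k = 0, 1, 2, \ldots$, where $e \approx 2.718$
2) $E(X) = \lambda$
3) $Var(X) = \lambda$

Example: Let $X \sim \text{Poisson}(3)$. Find: 1) $P(X > 0)$ 2) $E(X^2)$
1) $P(X > 0) = 1 - P(X \leq 0) = 1 - P(X = 0) = 1 - \dfrac{e^{-3}\,3^0}{0!} = 1 - e^{-3} \approx 0.9502$
2) $Var(X) = E(X^2) - (E(X))^2$. Thus $E(X^2) = Var(X) + (E(X))^2 = 3 + 3^2 = 12$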

Example: The number of cases admitted to the CCU in a certain hospital is distributed according to a Poisson distribution with an average of 3 cases per day. Find the probability of admitting 25 cases to the CCU in this hospital in a random week.

Let $X$ be the number of cases admitted to the CCU in this hospital in a week. Then $X \sim \text{Poisson}(3 \times 7) = \text{Poisson}(21)$. Thus, $P(X = 25) = \dfrac{e^{-21}\,21^{25}}{25!} = 0.055546$

Note: The Poisson distribution is completely determined by $\lambda$. It is called the parameter of the Poisson distribution.

Poisson Tables: Poisson tables enable us to read the value of $P(X \leq k)$ for any $k = 0, 1, 2, \ldots$ when $X \sim \text{Poisson}(\lambda)$, for several values of $\lambda$.

The following is a part of a Poisson table.

[Poisson table not reproduced here.]

Exercise: Let $X \sim \text{Poisson}(1.5)$. Use the above table to find:
1) $P(X \leq 3)$
2) $P(X < 3)$
3) $P(X = 3)$
4) $P(X > 2)$
5) $P(X \geq 2)$
6) $P(2 < X < 5)$
7) $P(2 \leq X \leq 5)$
8) $P(2 \leq X < 5)$
9) $P(2 < X \leq 5)$

Reading Assignment: Chapter 4 (4.4) in W.W. Daniel.
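The Poisson calculations in this section can be reproduced with a short script. This is a sketch only: `poisson_pmf` is an illustrative helper, and the weekly rate assumes, as the CCU example does, that a daily average of 3 scales to $3 \times 7 = 21$ per week.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# X ~ Poisson(3): probability of at least one occurrence
p_positive = 1 - poisson_pmf(0, 3)

# CCU example: 3 cases per day on average, so 21 per week
daily_rate = 3
weekly_rate = daily_rate * 7
p_25_in_week = poisson_pmf(25, weekly_rate)

print(round(p_positive, 4), round(p_25_in_week, 6))
```

The key step is rescaling $\lambda$ to the length of the interval in the question before applying the formula.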

The Normal Distribution: The normal distribution is probably one of the most important and widely used continuous distributions. A normally distributed random variable is known as a normal random variable. The following are the properties of the normal distribution:

Properties of the Normal Distribution:
1. It is bell shaped and is symmetrical about its mean.
2. Its mean equals its median equals its mode.
3. It is a continuous distribution.
4. It is completely determined by its mean and its variance. A normal random variable $X$ with mean $\mu$ and variance $\sigma^2$ is expressed as $X \sim N(\mu, \sigma^2)$.
5. The total area under the curve equals 1. Thus, the area of the distribution on each side of the mean is 0.5.
6. The probability that the normal random variable will have a value between any two points is equal to the area under the curve between those points.

The curve on the right is skewed to the right. Its mode < its median < its mean. The one on the left is skewed to the left. Its mode > its median > its mean.

To find the probability that a normal random variable $X$ will have a value smaller than a given number, we transform the normal random variable $X$ to the standard normal random variable $Z$ that has mean 0 and variance 1. This transformation is done using the formula $Z = \dfrac{X - \mu}{\sigma}$. A standard $Z$ table can be used to find probabilities for any normal curve problem that has been converted to $Z$ scores.

The following steps are helpful when working with normal curve problems:
1. Graph the normal distribution, and shade the area related to the probability you want to find.
2. Convert the boundaries of the shaded area from $X$ values to standard normal $Z$ values using the $Z$ formula above.
3. Use the standard $Z$ table to find the probabilities or the areas related to the $Z$ values in step 2.

Example: The weights of 1000 children are normally distributed with mean 25 kg and standard deviation 5 kg.
1) Find the proportion of children that have weights between 22 kg and 28 kg.
2) About how many children have weights smaller than 30 kg?
3) If a child is randomly selected, find the probability that her/his weight is smaller than 28 kg.
4) Find the third quartile of the weights of these children.
5) Find a positive number $C$ such that 68% of the children have weights between $25 - C$ and $25 + C$.
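The standardization steps can also be carried out in software instead of with a printed $Z$ table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with the mean 25 kg and standard deviation 5 kg of the weights example; `z_score` is an illustrative helper name. Because the software does not round $Z$ to two decimals, its results can differ from table-based answers in the third decimal place.

```python
from statistics import NormalDist

MU, SIGMA = 25, 5          # mean and standard deviation of the weights (kg)
standard_normal = NormalDist()   # Z with mean 0 and variance 1

def z_score(x, mu, sigma):
    """Step 2: convert an X value to a standard normal Z value."""
    return (x - mu) / sigma

# Proportion of children with weights between 22 kg and 28 kg
p_between = (standard_normal.cdf(z_score(28, MU, SIGMA))
             - standard_normal.cdf(z_score(22, MU, SIGMA)))

# Approximate number of the 1000 children lighter than 30 kg
count_below_30 = 1000 * standard_normal.cdf(z_score(30, MU, SIGMA))

# Third quartile: invert the transformation, x = sigma * z + mu
third_quartile = SIGMA * standard_normal.inv_cdf(0.75) + MU

print(round(p_between, 4), round(count_below_30), round(third_quartile, 2))
```

Finding a quartile runs the procedure in reverse: look up the $Z$ value for the required area first, then convert it back to an $X$ value.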

Let $X$ represent the children's weights. Then $X \sim N(25, 5^2)$.

1) $P(22 < X < 28) = P\left(\dfrac{22 - 25}{5} < Z < \dfrac{28 - 25}{5}\right) = P(-0.6 < Z < 0.6) = P(Z < 0.6) - P(Z < -0.6) = 0.7257 - 0.2743 = 0.4514$

2) $P(X < 30) = P\left(Z < \dfrac{30 - 25}{5}\right) = P(Z < 1) = 0.841$
Thus, about $0.841 \times 1000 = 841$ children have weights less than 30 kg.

3) Find $P(X < 28)$ (Exercise)

4) The third quartile is nothing but the number $Q_3$ which is characterized by the property $P(X < Q_3) = 0.75$. Thus, $P\left(Z < \dfrac{Q_3 - 25}{5}\right) = 0.75$. From the standard normal table we find that $\dfrac{Q_3 - 25}{5} = 0.67$. Hence, $Q_3 = 5 \times 0.67 + 25 = 28.35$ kg.

5) $P(25 - C < X < 25 + C) = 0.68$
$P\left(-\dfrac{C}{5} < Z < \dfrac{C}{5}\right) = 0.68$
By symmetry, $2\,P\left(Z < \dfrac{C}{5}\right) - 1 = 0.68$, so $P\left(Z < \dfrac{C}{5}\right) = 0.84$
$\dfrac{C}{5} = 1$, hence $C = 5$.

Reading Assignment: Chapter 4 (4.6, 4.7) in W.W. Daniel.

Chapter 5 Some Important Sampling Distributions

Introduction: A statistical measure for a sample is called a statistic and a statistical measure for a population is called a parameter. Examples of statistics are $\bar{x}$, $s$, $\hat{p}$. The following are parameters: $\mu$, $\sigma$, $p$. A statistic is a random variable but a parameter is not. Sample statistics like $\bar{x}$ and $s$ are used to estimate population parameters like $\mu$ and $\sigma$, respectively. There is some difference (or error) between statistics and parameters. Different samples from the same population may have different amounts of sampling error. Studying sampling distributions of sample statistics helps us understand statistical inference and allows us to answer questions about sample statistics.

Sampling Distributions: The sampling distribution of a statistic is the distribution of the values taken by that statistic in all possible samples of the same size that are drawn from the same population.

Note: The number of all possible samples of size $n$, drawn without replacement from a population of size $N$, equals $\binom{N}{n} = \dfrac{N!}{n!\,(N-n)!}$. If we allow replacement then the number of all possible samples is $N^n$.

Example: The following table gives all possible samples of size 2 drawn with replacement from a population that comprises the weights (in pounds) of 5 children, together with the mean of each sample.

Population data: 65 54 67 65 88

    Population   65             54             67             65             88
    65           (65,65), 65    (54,65), 59.5  (67,65), 66    (65,65), 65    (88,65), 76.5
    54           (65,54), 59.5  (54,54), 54    (67,54), 60.5  (65,54), 59.5  (88,54), 71
    67           (65,67), 66    (54,67), 60.5  (67,67), 67    (65,67), 66    (88,67), 77.5
    65           (65,65), 65    (54,65), 59.5  (67,65), 66    (65,65), 65    (88,65), 76.5
    88           (65,88), 76.5  (54,88), 71    (67,88), 77.5  (65,88), 76.5  (88,88), 88

[Chart: distribution of the above sample means.]
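The table of samples can be generated by enumeration, which also makes the counting note concrete. The following is a minimal sketch using `itertools.product` for sampling with replacement; the variable names are illustrative. It additionally checks that the average of all 25 sample means equals the population mean, a property of this sampling distribution that can be verified directly.

```python
import math
from itertools import product

population = [65, 54, 67, 65, 88]   # weights in pounds
n = 2                               # sample size

# All ordered samples of size n drawn with replacement: N^n of them
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

population_mean = sum(population) / len(population)
mean_of_sample_means = sum(sample_means) / len(sample_means)

# Number of samples without replacement (order ignored): C(N, n)
samples_without_replacement = math.comb(len(population), n)

print(len(samples), samples_without_replacement)
print(population_mean, round(mean_of_sample_means, 1))
```

Listing `sample_means` reproduces the means shown in the 5 by 5 table, one per cell.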

Sampling Distribution of the Mean:

Theorem: The sampling distribution of $\bar{x}$ in a normally distributed population with mean $\mu$ and standard deviation $\sigma$ is also normally distributed with mean $\mu$ and standard deviation $\dfrac{\sigma}{\sqrt{n}}$, where $n$ is the sample size, provided that sampling is performed with replacement. If sampling is performed without replacement then the sampling distribution is also normally distributed with mean $\mu$ and standard deviation $\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$, where $N$ is the size of the population.

The factor $\sqrt{\dfrac{N - n}{N - 1}}$ is called the correction factor. It is negligible if $n \leq 0.05\,N$ or $N$ is very large (infinite or practically infinite).

The Central Limit Theorem (CLT): When the sample size is large ($n \geq 30$), the above Theorem remains approximately valid even if the population is not normally distributed. In fact the sampling distribution of the mean is almost normal when $n$ is large. The larger the sample size, the closer the sampling distribution of the mean is to being normally distributed.

Example: Suppose that the ages of Jordan University students follow a normal distribution with mean 20.5 years and standard deviation 1.4 years. If we repeatedly collect samples of size $n = 49$:

a) what is the sampling distribution of $\bar{x}$?
Answer: $\bar{x} \sim N\left(20.5, \left(\dfrac{1.4}{\sqrt{49}}\right)^2\right) = N(20.5, 0.04) = N(20.5, (0.2)^2)$

b) what is the probability that the mean age of a randomly selected sample of size 49 of Jordan University students is smaller than 21 years?
Answer: $P(\bar{x} < 21) = P\left(Z < \dfrac{21 - 20.5}{0.2}\right) = P(Z < 2.5) = 0.9938$

c) what is the probability that an individual student is younger than 21 years old?
Answer: $X \sim N(20.5, (1.4)^2)$, thus $P(X < 21) = P\left(Z < \dfrac{21 - 20.5}{1.4}\right) = P(Z < 0.36) = 0.6406$

d) what is the distribution of $\bar{x}$ if the ages of Jordan University students do not follow a normal distribution?

Answer: The distribution of $\bar{x}$ will be approximately normal with mean 20.5 and standard deviation 0.2, since the sample size is greater than 30.

Reading Assignment: Chapter 5 (5.1, 5.2, 5.3) in W.W. Daniel.

Distribution of the Difference Between Two Sample Means: Suppose that we want to know whether or not the mean serum cholesterol level is higher in a population of sedentary office workers than in a population of laborers. If we know that those means are different then we may wish to know by how much they differ. One way is to take a random sample from each population, then look at the sampling distribution of $\bar{x}_1 - \bar{x}_2$ to answer probability questions and draw statistical inferences.

Sampling Distribution of $\bar{x}_1 - \bar{x}_2$:

Theorem: If we draw two independent random samples of sizes $n_1$ and $n_2$ from two distinct normally distributed populations, having means $\mu_1$, $\mu_2$ and standard deviations $\sigma_1$ and $\sigma_2$, respectively, then $\bar{x}_1 - \bar{x}_2$ is normally distributed with mean $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$ and standard deviation $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Note: The above theorem is also valid if the populations are not (both) normally distributed, provided that both $n_1$ and $n_2$ are greater than or equal to 30.

Example: One group on a diet lost an average of 7.2 kg with standard deviation 3.7 kg; another group on sportive exercises lost an average of 4.0 kg with a standard deviation of 3.9 kg. Suppose we collect samples of sizes $n_1 = 42$ from the diet group and $n_2 = 47$ from the exercises group:

(a) what is the sampling distribution of $\bar{x}_1 - \bar{x}_2$?
Answer: the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is approximately normal (since $n_1 \geq 30$ and $n_2 \geq 30$) with mean $7.2 - 4.0 = 3.2$ kg and standard deviation $\sqrt{\dfrac{(3.7)^2}{42} + \dfrac{(3.9)^2}{47}} = 0.806$ kg

(b) what is the probability that the difference between mean weight loss of the two groups is larger than 4.0 kg?

Answer: $P(\bar{x}_1 - \bar{x}_2 > 4.0) = P\left(Z > \dfrac{4.0 - 3.2}{0.806}\right) = P(Z > 0.993) = 1 - P(Z < 0.99) = 1 - 0.8389 = 0.1611$

(c) what is the probability that the mean weight loss of the exercises group is larger than 4.0 kg?
Answer: $\bar{x}_2 \sim N\left(4.0, \left(\dfrac{3.9}{\sqrt{47}}\right)^2\right) = N(4.0, (0.569)^2)$, thus $P(\bar{x}_2 > 4.0) = P\left(Z > \dfrac{4.0 - 4.0}{0.569}\right) = P(Z > 0) = 0.5$

(d) Find the IQR (interquartile range) of $\bar{x}_1 - \bar{x}_2$.
$P(\bar{x}_1 - \bar{x}_2 < Q_3) = 0.75$, so $P\left(Z < \dfrac{Q_3 - 3.2}{0.806}\right) = 0.75$, which gives $\dfrac{Q_3 - 3.2}{0.806} = 0.675$ and $Q_3 = 0.675 \times 0.806 + 3.2 = 3.744$
$P(\bar{x}_1 - \bar{x}_2 < Q_1) = 0.25$, so $P\left(Z < \dfrac{Q_1 - 3.2}{0.806}\right) = 0.25$, which gives $\dfrac{Q_1 - 3.2}{0.806} = -0.675$ and $Q_1 = 3.2 - 0.675 \times 0.806 = 2.656$
Thus, $IQR = Q_3 - Q_1 = 3.744 - 2.656 = 1.088$

Distribution of the Sample Proportion: In this section we study the distribution of the sample proportion. Such a distribution helps us answer probability questions about proportions when it is tedious, difficult or practically impossible to use binomial tables. For example, suppose that in a certain population 8 percent are color blind (that is, $p = 0.08$). If we randomly select 1500 individuals from this population, what is the probability that the proportion of color blind individuals in that sample is at least 0.10? To answer such a question using binomial tables we need to find the probability that the variable $x$ is greater than or equal to $0.10 \times 1500 = 150$, given that $x$ is binomially distributed with $p = 0.08$ and $n = 1500$. How would we answer that question if we don't have binomial tables for $n = 1500$ (or even for any $n > 25$)?

Distribution of Sample Proportion; An Empirical Rule: When the sample size is "large" (we will see shortly what large means), the distribution of sample proportions is approximately normal with mean equal to the true population proportion $p$ and standard deviation equal to $\sqrt{\dfrac{p(1 - p)}{n}}$. The sample is considered "large enough" if $np > 5$ and $n(1 - p) > 5$.
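One way to see what the normal approximation buys is to compare it with the exact binomial tail for the color-blindness question. The sketch below is an illustration under stated assumptions, not the handout's method: it sums the binomial probabilities for $x \geq 150$ using logarithms (via `math.lgamma`) to avoid overflow for $n = 1500$, and the helper name `binomial_log_pmf` is hypothetical. The approximation here uses no continuity correction, so the two values are expected to be close but not identical.

```python
import math
from statistics import NormalDist

def binomial_log_pmf(k, n, p):
    """Natural log of P(X = k) for X ~ Bin(n, p), computed with log-gamma."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1 - p)

n, p = 1500, 0.08
threshold = 150   # 0.10 * 1500 color blind individuals

# Exact tail P(X >= 150) by direct summation of the binomial probabilities
exact_tail = sum(math.exp(binomial_log_pmf(k, n, p))
                 for k in range(threshold, n + 1))

# Normal approximation for the sample proportion
sd = math.sqrt(p * (1 - p) / n)
approx_tail = 1 - NormalDist(mu=p, sigma=sd).cdf(0.10)

print(round(sd, 4), exact_tail, approx_tail)
```

The exact sum needs over a thousand terms, while the approximation needs one standard deviation and one table lookup, which is the practical motivation for the rule.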

Example: Suppose that in a certain population 8 percent are color blind ($p = 0.08$), and we randomly select 1500 individuals from this population. Find:
a) the probability that the proportion of color blind individuals in that sample is at least 0.10.
b) the 95th percentile of $\hat{p}$.

a) $p = 0.08$ and $n = 1500$. Since $np = 1500 \times 0.08 = 120 > 5$ and $n(1 - p) = 1500 \times 0.92 = 1380 > 5$, the proportion of color blind individuals is approximately normally distributed with mean $p = 0.08$ and standard deviation $\sqrt{\dfrac{p(1 - p)}{n}} = \sqrt{\dfrac{0.08 \times 0.92}{1500}} = 0.007$
Thus $P(\hat{p} \geq 0.10) = P\left(Z \geq \dfrac{0.10 - 0.08}{0.007}\right) = P(Z \geq 2.86) = 1 - 0.9979 = 0.0021$

b) $P(\hat{p} < P_{95}) = 0.95$, so $P\left(Z < \dfrac{P_{95} - 0.08}{0.007}\right) = 0.95$, which gives $\dfrac{P_{95} - 0.08}{0.007} = 1.65$ and $P_{95} = 1.65 \times 0.007 + 0.08 = 0.09155$

Distribution of the difference between two sample proportions: HANDOUT IS NOT AVAILABLE. READ DIRECTLY FROM YOUR MAIN REFERENCE.

Reading Assignment: Chapter 5 (5.1, 5.2, 5.3, 5.4, 5.5, 5.6) in W.W. Daniel.
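The empirical rule for sample proportions can be wrapped in a small helper that first checks the "large enough" condition and then returns the approximating normal distribution. This is a sketch under the rule stated in this chapter; the function name `proportion_distribution` is a hypothetical choice. Software uses $z \approx 1.645$ for the 95th percentile rather than the rounded table value 1.65, so the percentile differs slightly from the hand calculation.

```python
import math
from statistics import NormalDist

def proportion_distribution(p, n):
    """Approximate normal distribution of the sample proportion.

    Raises ValueError when the rule np > 5 and n(1 - p) > 5 is not met.
    """
    if not (n * p > 5 and n * (1 - p) > 5):
        raise ValueError("sample is not large enough for the normal approximation")
    return NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n))

# Color-blindness example: p = 0.08, n = 1500
p_hat = proportion_distribution(0.08, 1500)

percentile_95 = p_hat.inv_cdf(0.95)      # value below which 95% of sample proportions fall
prob_at_least_10pct = 1 - p_hat.cdf(0.10)

print(round(p_hat.stdev, 4), round(percentile_95, 4), round(prob_at_least_10pct, 4))
```

Calling `proportion_distribution(0.08, 20)` raises an error, since $np = 1.6$ fails the rule and the normal approximation should not be used there.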