CORRELATION AND REGRESSION


the Further Mathematics network, www.fmnetwork.org.uk, V 7 1 1

REVISION SHEET STATISTICS 1 (Edx)

The main ideas are:
- Scatter Diagrams and Lines of Best Fit
- Pearson's Product Moment Correlation
- The Least Squares Regression Line

Before the exam you should know:
- When to use Pearson's product moment correlation coefficient.
- How to use summary statistics such as $\sum x$, $\sum x^2$, $\sum y$, $\sum y^2$, $\sum xy$ to calculate $S_{xx}$, $S_{yy}$, $S_{xy}$.
- How to recognise when a 1-tail or 2-tail test is required.
- What is meant by a residual and by the least squares regression line.

Scatter Diagrams

With bivariate data we are usually trying to investigate whether there is a correlation between the two underlying variables, usually called x and y. Pearson's product moment correlation coefficient, r, is a number between -1 and +1 which can be calculated as a measure of the correlation in a population of bivariate data.

[Figure: five scatter diagrams. Perfect positive correlation (r = 1); positive correlation (r ≈ 0.5); no correlation (r ≈ 0); negative correlation (r ≈ -0.5); perfect negative correlation (r = -1).]

Beware of diagrams which appear to indicate a linear correlation but in fact do not:

[Figure: two scatter diagrams. In the first, two outliers give the impression that there is a linear relationship where in fact there is no correlation. In the second, there are 2 distinct groups, neither of which has a correlation.]

Product Moment Correlation

Pearson's product moment correlation coefficient:

\[ r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} \]

where:

\[ S_{xy} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - n\bar{x}\bar{y} \]
\[ S_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - n\bar{x}^2 \]
\[ S_{yy} = \sum (y - \bar{y})^2 = \sum y^2 - n\bar{y}^2 \]

A value of +1 means perfect positive correlation, a value close to 0 means no correlation and a value of -1 means perfect negative correlation. The closer the value of r is to +1 or -1, the stronger the correlation.

Example

A games commentator wants to see if there is any correlation between ability at chess and at bridge. A random sample of eight people, who play both chess and bridge, were chosen and their grades in chess and bridge were as follows:

Player            A     B     C     D     E     F     G     H
Chess grade x     160   187   129   162   149   151   189   158
Bridge grade y    75    100   75    85    80    70    95    80

Using a calculator: $n = 8$, $\sum x = 1285$, $\sum y = 660$, $\sum x^2 = 209141$, $\sum y^2 = 55200$, $\sum xy = 107230$, $\bar{x} = 160.625$, $\bar{y} = 82.5$.

\[ r = \frac{107230 - 8 \times 160.625 \times 82.5}{\sqrt{(209141 - 8 \times 160.625^2)(55200 - 8 \times 82.5^2)}} = 0.850 \text{ (3 s.f.)} \]

The Least Squares Regression Line

This is a line of best fit which produces the least possible value of the sum of the squares of the residuals (a residual is the vertical distance between a point and the line of best fit). It is given by:

\[ y - \bar{y} = \frac{S_{xy}}{S_{xx}} (x - \bar{x}) \]

Alternatively, $y = a + bx$ where $b = \frac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$.

Predicted values

For any pair of values (x, y), the predicted value of y is given by $\hat{y} = a + bx$.

If the regression line is a good fit to the data, the equation may be used to predict y values for x values within the given domain, i.e. interpolation. It is unwise to use the equation for predictions if the regression line is not a good fit for any part of the domain (set of x values) or if the x value is outside the given domain, i.e. the equation is used for extrapolation.

The corresponding residual is $\varepsilon = y - \hat{y} = y - (a + bx)$.
The sum of the residuals is $\sum \varepsilon = 0$.
The least squares regression line minimises the sum of the squares of the residuals, $\sum \varepsilon^2$.
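The calculation above can be checked with a short Python sketch. It is an illustration only, not part of the original sheet: the function name `summary_sums` and the variable names are assumptions, and the grades used are those of the eight players (chess 160, 187, 129, 162, 149, 151, 189, 158; bridge 75, 100, 75, 85, 80, 70, 95, 80). The regression coefficients a and b it produces are computed here and are not quoted in the sheet.

```python
from math import sqrt

chess = [160, 187, 129, 162, 149, 151, 189, 158]   # x values
bridge = [75, 100, 75, 85, 80, 70, 95, 80]          # y values


def summary_sums(xs, ys):
    """Return x_bar, y_bar, S_xx, S_yy, S_xy from the summary-statistic formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum(x * x for x in xs) - n * x_bar ** 2
    s_yy = sum(y * y for y in ys) - n * y_bar ** 2
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    return x_bar, y_bar, s_xx, s_yy, s_xy


x_bar, y_bar, s_xx, s_yy, s_xy = summary_sums(chess, bridge)

# Pearson's product moment correlation coefficient
r = s_xy / sqrt(s_xx * s_yy)

# Least squares regression line of y on x: y = a + b x
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residuals y - (a + b x); for a least squares line they sum to zero
residuals = [y - (a + b * x) for x, y in zip(chess, bridge)]

print(f"r = {r:.3f}")
print(f"y = {a:.3f} + {b:.4f} x")
print(f"sum of residuals = {sum(residuals):.2e}")
```

The sum of the residuals comes out as zero up to floating-point rounding, which is a quick sanity check that a and b were computed from the same summary statistics.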
Acknowledgement: Some material on these pages was originally created by Bob Francis and we acknowledge his permission to reproduce such material in this revision sheet.

DISCRETE RANDOM VARIABLES

The main ideas are:
- Discrete random variables
- Expectation (mean) of a discrete random variable
- Variance of a discrete random variable

Discrete random variables with probabilities $p_1, p_2, p_3, p_4, \ldots, p_n$ can be illustrated using a vertical line chart.

Notation

- A discrete random variable is usually denoted by a capital letter (X, Y etc).
- Particular values of the variable are denoted by small letters (r, x etc).
- $P(X = r_1)$ means the probability that the discrete random variable X takes the value $r_1$.
- $\sum P(X = r_k)$ means the sum of the probabilities for all values of r; in other words $\sum P(X = r_k) = 1$.

Before the exam you should know:
- Discrete random variables are used to create mathematical models to describe and explain data you might find in the real world.
- You must understand the notation that is used.
- You must know that a discrete random variable X takes values $r_1, r_2, r_3, r_4, \ldots, r_n$ with corresponding probabilities $p_1, p_2, p_3, p_4, \ldots, p_n$.
- Remember that the sum of these probabilities will be 1, so $p_1 + p_2 + p_3 + p_4 + \ldots + p_n = \sum P(X = r_k) = 1$.
- You should understand that the expectation (mean) of a discrete random variable is defined by $E(X) = \mu = \sum r P(X = r)$.
- You should understand that the variance of a discrete random variable is defined by $\mathrm{Var}(X) = \sigma^2 = E[(X - \mu)^2] = \sum (r - \mu)^2 P(X = r)$, or equivalently $\mathrm{Var}(X) = \sigma^2 = E(X^2) - [E(X)]^2$.

Example: A child throws two fair dice and adds the numbers on the faces. Let X be the total. Find
(i) P(X = 4), the probability that the total is 4;
(ii) P(X < 7), the probability that the total is less than 7.

Answer:
(i) P(X = 4) = 3/36 = 1/12
(ii) P(X < 7) = 15/36 = 5/12

Example: X is a discrete random variable given by $P(X = r) = \frac{k}{r}$ for r = 1, 2, 3, 4. Find the value of k and illustrate the distribution.
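Answer: To find the value of k, use $\sum P(X = x_i) = 1$:

\[ \sum P(X = x_i) = \frac{k}{1} + \frac{k}{2} + \frac{k}{3} + \frac{k}{4} = 1 \quad\Rightarrow\quad \frac{25}{12} k = 1 \quad\Rightarrow\quad k = \frac{12}{25} = 0.48 \]

The probabilities are therefore 0.48, 0.24, 0.16 and 0.12 for r = 1, 2, 3, 4. Illustrate with a vertical line chart.

Example: Calculate the expectation and variance of the distribution.

Answer:
Expectation: $E(X) = \mu = \sum r P(X = r) = 1 \times 0.48 + 2 \times 0.24 + 3 \times 0.16 + 4 \times 0.12 = 1.92$
$E(X^2) = \sum r^2 P(X = r) = 1^2 \times 0.48 + 2^2 \times 0.24 + 3^2 \times 0.16 + 4^2 \times 0.12 = 4.8$
Variance: $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 4.8 - 1.92^2 = 1.1136$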
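The normalisation step and the two expectations can be reproduced exactly with Python's `fractions` module. This is a minimal sketch for checking the arithmetic, assuming only the rule P(X = r) = k/r for r = 1, 2, 3, 4; the variable names are illustrative.

```python
from fractions import Fraction

values = [1, 2, 3, 4]

# The probabilities k/r must sum to 1, so k = 1 / (1/1 + 1/2 + 1/3 + 1/4)
k = 1 / sum(Fraction(1, r) for r in values)

# Probability function P(X = r) = k / r
pmf = {r: k / r for r in values}

mean = sum(r * p for r, p in pmf.items())         # E(X)
mean_sq = sum(r * r * p for r, p in pmf.items())  # E(X^2)
variance = mean_sq - mean ** 2                    # E(X^2) - [E(X)]^2

print(k, mean, mean_sq, variance)
# prints: 12/25 48/25 24/5 696/625
```

In decimals these are k = 0.48, E(X) = 1.92, E(X²) = 4.8 and Var(X) = 1.1136.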

Using tables: For a small set of values it is often convenient to list the probabilities for each value in a table:

r_i           r_1   r_2   r_3   ...   r_(n-1)   r_n
P(X = r_i)    p_1   p_2   p_3   ...   p_(n-1)   p_n

Using formulae: Sometimes it is possible to define the probability function as a formula, as a function of r, P(X = r) = f(r).

Calculating probabilities: Sometimes you need to be able to calculate the probability of some compound event, given the values from the table or function.

Explanation of probabilities: Often you need to explain how the probability $P(X = r_k)$, for some value of k, is derived from first principles.

Example: The discrete random variable X has the distribution shown in the table.

r           0      1      2      3
P(X = r)    0.15   0.2    0.35   0.3

(i) Find E(X).
(ii) Find E(X²).
(iii) Find Var(X) using (a) $E(X^2) - \mu^2$ and (b) $E[(X - \mu)^2]$.
(iv) Hence calculate the standard deviation.

r                    0       1       2       3       totals
P(X = r)             0.15    0.2     0.35    0.3     1
r P(X = r)           0       0.2     0.7     0.9     1.8
r² P(X = r)          0       0.2     1.4     2.7     4.3
(r - μ)²             3.24    0.64    0.04    1.44    5.36
(r - μ)² P(X = r)    0.486   0.128   0.014   0.432   1.06

(i) $E(X) = \mu = \sum r P(X = r) = 0 \times 0.15 + 1 \times 0.2 + 2 \times 0.35 + 3 \times 0.3 = 0 + 0.2 + 0.7 + 0.9 = 1.8$. This is the expectation (μ).

(ii) $E(X^2) = \sum r^2 P(X = r) = 0^2 \times 0.15 + 1^2 \times 0.2 + 2^2 \times 0.35 + 3^2 \times 0.3 = 0 + 0.2 + 1.4 + 2.7 = 4.3$. This is E(X²).

(iii) (a) $\mathrm{Var}(X) = E(X^2) - \mu^2 = 4.3 - 1.8^2 = 1.06$
(b) $\mathrm{Var}(X) = E[(X - \mu)^2] = 0.15(0 - 1.8)^2 + 0.2(1 - 1.8)^2 + 0.35(2 - 1.8)^2 + 0.3(3 - 1.8)^2 = 0.486 + 0.128 + 0.014 + 0.432 = 1.06$. This is $\mathrm{Var}(X) = \sum (r - \mu)^2 P(X = r)$.
Notice that the two methods give the same result, since the formulae are just rearrangements of each other.

(iv) $s = \sqrt{1.06} = 1.02956\ldots = 1.030$ (3 d.p.). The standard deviation (s) is the square root of the variance.
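The two routes to the variance can be compared numerically. The sketch below is illustrative rather than part of the sheet: it assumes the tabulated distribution P(X = 0) = 0.15, P(X = 1) = 0.2, P(X = 2) = 0.35, P(X = 3) = 0.3, and the helper name `expectation` is hypothetical.

```python
from math import sqrt

# Probability table: value r -> P(X = r)
pmf = {0: 0.15, 1: 0.2, 2: 0.35, 3: 0.3}


def expectation(pmf, g=lambda r: r):
    """Return E[g(X)] = sum of g(r) * P(X = r) over the table."""
    return sum(g(r) * p for r, p in pmf.items())


mu = expectation(pmf)                                # E(X)
var_a = expectation(pmf, lambda r: r ** 2) - mu ** 2  # (a) E(X^2) - mu^2
var_b = expectation(pmf, lambda r: (r - mu) ** 2)     # (b) E[(X - mu)^2]
sd = sqrt(var_a)

print(f"mu = {mu:.2f}, var (a) = {var_a:.2f}, var (b) = {var_b:.2f}, sd = {sd:.3f}")
```

Passing the function g as an argument keeps one summation routine for E(X), E(X²) and E[(X − μ)²], which mirrors how the rows of the table are built up.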

EXPLORING DATA

The main ideas are:
- Types of data
- Stem and leaf
- Measures of central tendency
- Measures of spread
- Coding

Types of data

- Categorical or qualitative data are data that are listed by their properties, e.g. colours of cars.
- Numerical or quantitative data:
  - Discrete data are data that can only take particular numerical values, e.g. shoe sizes.
  - Continuous data are data that can take any value. They are often gathered by measuring, e.g. length, temperature.

Before the exam you should:
- Be able to identify whether the data is categorical, discrete or continuous.
- Know how to describe the shape of a distribution, say whether it is skewed positively or negatively, and be able to identify any outliers.
- Be able to draw an ordered stem and leaf diagram and a back to back stem and leaf diagram.
- Be able to calculate and comment on the mean, mode, median and mid-range.
- Be able to calculate the range, variance and standard deviation of the data.

Shapes of distributions

[Figure: three distribution shapes. Symmetrical (unimodal); uniform; bimodal. Bimodal does not mean that the peaks have to be the same height.]

Frequency distributions

Data are presented in tables which summarise the data. This allows you to get an idea of the shape of the distribution. Grouped discrete data can be treated as if it were continuous, e.g. the distribution of marks in a test.

Skew

[Figure: three distribution shapes. Positive skew; symmetrical; negative skew.]

Stem and leaf diagrams

A concise way of displaying discrete or continuous data (measured to a given accuracy) whilst retaining the original information. Data are usually sorted in ascending order, and the diagram can be used to find the mode, median and quartiles. You are likely to be asked to comment on the shape of the distribution.

Example

Average daily temperatures in 16 cities are recorded in January and July.
The results are:

January: 2, 18, 3, 6, -3, 23, -5, 17, 14, 29, 28, -1, 2, -9, 28, 19
July: 21, 2, 16, 25, 5, 25, 19, 24, 28, -1, 8, -4, 18, 13, 14, 21

Draw a back to back stem and leaf diagram and comment on the shape of the distributions.

Answer

  January | stem | July
  9 5 3 1 |  -0  | 1 4
  6 3 2 2 |   0  | 2 5 8
  9 8 7 4 |   1  | 3 4 6 8 9
  9 8 8 3 |   2  | 1 1 4 5 5 8

The January data is uniform but the July data has a negative skew.

Note: Histograms, stem and leaf diagrams and box plots will not be the direct focus of examination questions.

Central Tendency (averages)

Mean: $\bar{x} = \frac{\sum x}{n}$ (raw data), $\bar{x} = \frac{\sum xf}{\sum f}$ (grouped data)
Median: mid-value when the data are placed in rank order
Mode: most common item, or the class with the highest frequency
Mid-range: (minimum + maximum) value ÷ 2

Outliers

These are pieces of data which are at least two standard deviations from the mean, i.e. beyond $\bar{x} \pm 2s$.

Dispersion (spread)

Range: maximum value − minimum value
Sum of squares: $S_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - n\bar{x}^2$ (raw data); $S_{xx} = \sum (x - \bar{x})^2 f = \sum x^2 f - n\bar{x}^2$ (frequency distribution)
Mean square deviation: $\mathrm{msd} = \frac{S_{xx}}{n}$
Root mean squared deviation: $\mathrm{rmsd} = \sqrt{\frac{S_{xx}}{n}}$
Variance: $s^2 = \frac{S_{xx}}{n - 1}$
Standard deviation: $s = \sqrt{\frac{S_{xx}}{n - 1}}$

Example: Heights measured to the nearest cm: 159, 160, 161, 166, 166, 166, 169, 173, 173, 174, 177, 177, 177, 178, 180, 181, 182, 182, 185, 196.

Modes = 166 and 177 (i.e. the data set is bimodal)
Mid-range = (159 + 196) ÷ 2 = 177.5
Median = (174 + 177) ÷ 2 = 175.5
Mean: $\bar{x} = \frac{\sum x}{n} = \frac{3482}{20} = 174.1$
Range = 196 − 159 = 37
Sum of squares: $S_{xx} = \sum x^2 - n\bar{x}^2 = 607886 - 20 \times 174.1^2 = 1669.8$
Root mean square deviation: $\mathrm{rmsd} = \sqrt{\frac{S_{xx}}{n}} = \sqrt{\frac{1669.8}{20}} = 9.14$ (3 s.f.)
Standard deviation: $s = \sqrt{\frac{S_{xx}}{n - 1}} = \sqrt{\frac{1669.8}{19}} = 9.37$ (3 s.f.)
Outliers: 174.1 ± 2 × 9.37 = 155.36 or 192.84. The value 196 lies beyond these limits, so there is one outlier.

Example

A survey was carried out to find how much time it took a group of pupils to complete their homework. The results are shown in the table below. Calculate an estimate for the mean and standard deviation of the data.
Time taken (hours), t    0 < t ≤ 1   1 < t ≤ 2   2 < t ≤ 3   3 < t ≤ 4   4 < t ≤ 6
Number of pupils, f      14          17          5           1           3

Answer

Time taken (hours), t    0 < t ≤ 1   1 < t ≤ 2   2 < t ≤ 3   3 < t ≤ 4   4 < t ≤ 6
Mid-interval, x          0.5         1.5         2.5         3.5         5
Number of pupils, f      14          17          5           1           3
fx                       7           25.5        12.5        3.5         15
fx²                      3.5         38.25       31.25       12.25       75

\[ \bar{x} = \frac{7 + 25.5 + 12.5 + 3.5 + 15}{14 + 17 + 5 + 1 + 3} = \frac{63.5}{40} = 1.5875 \]
\[ S_{xx} = (3.5 + 38.25 + 31.25 + 12.25 + 75) - (40 \times 1.5875^2) = 160.25 - 100.81 = 59.44 \]
\[ s = \sqrt{\frac{59.44}{39}} = 1.235 \text{ (3 d.p.)} \]

Linear coding

If the data are coded as $y = ax + b$ then the mean has the same coding, $\bar{y} = a\bar{x} + b$, and the standard deviation is multiplied by the multiplier of x, $s_y = a s_x$ (for a > 0).

Example: For two sets of data x and y it is found that they are related by the formula $y = 5x - 20$. Given $\bar{x} = 24.8$ and $s_x = 7.3$, find the values of $\bar{y}$ and $s_y$.

$\bar{y} = (5 \times 24.8) - 20 = 104$
$s_y = 5 \times 7.3 = 36.5$
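The summary measures for raw and grouped data, and the effect of linear coding, can be checked with one small function. This is a sketch under stated assumptions: the function name `sample_sd` is hypothetical, the grouped calculation uses the class mid-points 0.5, 1.5, 2.5, 3.5 and 5 with frequencies 14, 17, 5, 1 and 3, and the coding y = 5x − 20 is applied to the heights data purely for illustration.

```python
from math import sqrt


def sample_sd(values, freqs=None):
    """Return (mean, s) using S_xx = sum(f x^2) - n * mean^2 and divisor n - 1."""
    if freqs is None:
        freqs = [1] * len(values)      # raw data: every value counted once
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    s_xx = sum(f * x * x for x, f in zip(values, freqs)) - n * mean ** 2
    return mean, sqrt(s_xx / (n - 1))


# Raw data: heights to the nearest cm
heights = [159, 160, 161, 166, 166, 166, 169, 173, 173, 174,
           177, 177, 177, 178, 180, 181, 182, 182, 185, 196]
h_mean, h_s = sample_sd(heights)
# Outlier rule used in the sheet: at least two standard deviations from the mean
outliers = [h for h in heights if abs(h - h_mean) >= 2 * h_s]

# Grouped data: class mid-points and frequencies from the homework survey
midpoints = [0.5, 1.5, 2.5, 3.5, 5.0]
pupils = [14, 17, 5, 1, 3]
t_mean, t_s = sample_sd(midpoints, pupils)

# Linear coding y = a x + b (a and b chosen for illustration)
a, b = 5, -20
c_mean, c_s = sample_sd([a * h + b for h in heights])

print(f"heights: mean = {h_mean:.1f}, s = {h_s:.2f}, outliers = {outliers}")
print(f"homework: mean = {t_mean:.4f}, s = {t_s:.3f}")
print(f"coded heights: mean = {c_mean:.1f}, s = {c_s:.2f}")
```

The coded mean equals a × (original mean) + b and the coded standard deviation equals a × (original s), which is the coding rule stated above.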

NORMAL DISTRIBUTION

The main ideas are:
- Properties of the Normal Distribution
- Mean, SD and Var

Before the exam you should know:
- All of the properties of the Normal Distribution.
- How to use the relevant tables.
- How to calculate mean, standard deviation and variance.

Definition

A continuous random variable X whose distribution is bell-shaped and has mean (expectation) μ and standard deviation σ is said to follow a Normal Distribution with parameters μ and σ. In shorthand, $X \sim N(\mu, \sigma^2)$.

This may be given in standardised form by using the transformation

\[ z = \frac{x - \mu}{\sigma} \quad\Leftrightarrow\quad x = \sigma z + \mu, \quad \text{where } Z \sim N(0, 1) \]

Calculating Probabilities

The area to the left of the value z, representing P(Z ≤ z), is denoted by Φ(z) and is read from tables for z ≥ 0.

Useful techniques for z ≥ 0:
P(Z > z) = 1 − P(Z ≤ z)
P(Z > −z) = P(Z ≤ z)
P(Z < −z) = 1 − P(Z ≤ z)

The inverse normal tables may be used to find $z = \Phi^{-1}(p)$ for p ≥ 0.5. For p < 0.5, use symmetry properties of the Normal distribution.

99.73% of values lie within 3 s.d. of the mean.

Estimating μ and/or σ

Use (simultaneous) equations of the form $x = \sigma z + \mu$ for matching (x, z) pairs, where z is given or may be deduced from $\Phi^{-1}(p)$ for given value(s) of x.

Example 1

$X \sim N(60, 16)$, so $z = \frac{x - 60}{4}$. Find (a) P(X < 66), (b) P(X ≥ 66), (c) P(55 ≤ X ≤ 63), (d) $x_0$ such that P(X > $x_0$) = 99%.

(a) P(X < 66) = P(Z < 1.5) = 0.9332
(b) P(X ≥ 66) = 1 − P(X < 66) = 1 − 0.9332 = 0.0668
(c) P(55 ≤ X ≤ 63) = P(−1.25 ≤ Z ≤ 0.75)
  = P(Z ≤ 0.75) − P(Z < −1.25)
  = P(Z ≤ 0.75) − P(Z > 1.25)
  = P(Z ≤ 0.75) − [1 − P(Z ≤ 1.25)]
  = 0.7734 − [1 − 0.8944] = 0.6678
(d) P(Z > −2.326) = 0.99 from tables. Since $z = \frac{x - 60}{4}$, $x_0 = 4z + 60 = 60 + 4 \times (-2.326) = 50.7$ (to 3 s.f.)

Example 2

For a certain type of apple, 20% have a mass greater than 130 g and 30% have a mass less than 110 g.
(a) Estimate μ and σ.
(b) When 5 apples are chosen at random, find the probability that all five have a mass exceeding 115 g.

(a) P(Z > 0.8416) = 0.2 (X = 130) and P(Z < −0.5244) = 0.3 (X = 110), so
130 = 0.8416σ + μ
110 = −0.5244σ + μ
Solving the equations simultaneously gives μ = 117.68, σ = 14.64.

(b) $X \sim N(117.68, 14.64^2)$, so $z = \frac{x - 117.68}{14.64}$.
$[P(X > 115)]^5 = [P(Z > -0.183)]^5 = 0.5726^5 = 0.0616$ (to 3 s.f.)

Acknowledgement: Material on this page was originally created by Bob Francis and we acknowledge his permission to reproduce it here.
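Where tables are not to hand, both examples can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch, not the method the sheet expects in an exam: it assumes Example 1 uses X ~ N(60, 16) and Example 2 uses the 130 g and 110 g conditions, and the last digit can differ slightly from table-based answers because no intermediate rounding takes place.

```python
from statistics import NormalDist

# Example 1: X ~ N(60, 16), i.e. mean 60 and standard deviation 4
X = NormalDist(mu=60, sigma=4)
p_a = X.cdf(66)               # P(X < 66)
p_b = 1 - X.cdf(66)           # P(X >= 66)
p_c = X.cdf(63) - X.cdf(55)   # P(55 <= X <= 63)
x0 = X.inv_cdf(0.01)          # P(X > x0) = 0.99 means P(X <= x0) = 0.01

# Example 2: 20% above 130 g and 30% below 110 g
Z = NormalDist()              # standard Normal N(0, 1)
z_upper = Z.inv_cdf(0.80)     # P(Z > z_upper) = 0.2
z_lower = Z.inv_cdf(0.30)     # P(Z < z_lower) = 0.3
# Solve 130 = z_upper * sigma + mu and 110 = z_lower * sigma + mu
sigma = (130 - 110) / (z_upper - z_lower)
mu = 130 - z_upper * sigma

apples = NormalDist(mu, sigma)
p_all_five = (1 - apples.cdf(115)) ** 5   # five independent apples all above 115 g

print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {p_c:.4f}  (d) {x0:.1f}")
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}, P(all five > 115 g) = {p_all_five:.4f}")
```

Subtracting the two simultaneous equations eliminates μ, which is why σ is found first from the difference of the two masses.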

PROBABILITY

The main ideas are:
- Measuring probability
- Estimating probability
- Expectation
- Combined probability
- Two trials
- Conditional probability

The experimental probability of an event is (number of successes) ÷ (number of trials).

If the experiment is repeated 100 times, then the expectation (expected frequency) is equal to 100 × P(A).

Before the exam you should know:
- The theoretical probability of an event A is given by $P(A) = \frac{n(A)}{n(\xi)}$, where A is the set of favourable outcomes and ξ is the set of all possible outcomes.
- The complement of A is written A′ and is the set of possible outcomes not in set A. P(A′) = 1 − P(A).
- For any two events A and B: P(A ∪ B) = P(A) + P(B) − P(A ∩ B) [or P(A or B) = P(A) + P(B) − P(A and B)].
- Tree diagrams are a useful way of illustrating probabilities for both independent and dependent events.
- Conditional probability is the probability that event B occurs given that event A has already happened. It is given by $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$.

The sample space for an experiment illustrates the set of all possible outcomes. Any event is a subset of the sample space. Probabilities can be calculated from first principles.

Example: If two fair dice are thrown and their scores added, the sample space is

 +  |  1   2   3   4   5   6
 1  |  2   3   4   5   6   7
 2  |  3   4   5   6   7   8
 3  |  4   5   6   7   8   9
 4  |  5   6   7   8   9  10
 5  |  6   7   8   9  10  11
 6  |  7   8   9  10  11  12

If event A is "the total is 7" then P(A) = 6/36 = 1/6.
If event B is "the total > 8" then P(B) = 10/36 = 5/18.
If the dice are thrown 100 times, the expectation of event B is 100 × P(B) = 100 × 5/18 = 27.7778, or 28 (to the nearest whole number).

More than one event

Events are mutually exclusive if they cannot happen at the same time, so P(A and B) = P(A ∩ B) = 0.

[Figure: Venn diagrams of events A and B, overlapping and non-overlapping.]

Example: An ordinary pack of cards is shuffled and a card chosen at random.
Event A (card chosen is a picture card): P(A) = 12/52
Event B (card chosen is a heart): P(B) = 13/52

Find the probability that the card is a picture card and a heart.
P(A ∩ B) = 12/52 × 13/52 = 3/52

Find the probability that the card is a picture card or a heart.
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 12/52 + 13/52 − 3/52 = 22/52 = 11/26

Addition rule for mutually exclusive events: P(A or B) = P(A ∪ B) = P(A) + P(B)
For non-mutually exclusive events: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
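Both examples can be verified from first principles by listing the sample space and counting outcomes. The sketch below is illustrative only: the helper name `prob` and the string labels for ranks and suits are assumptions, and `Fraction` is used so the answers appear as exact fractions.

```python
from fractions import Fraction
from itertools import product

# Sample space for the total of two fair dice: 36 equally likely outcomes
totals = [a + b for a, b in product(range(1, 7), repeat=2)]


def prob(event):
    """P(event) = number of favourable outcomes / number of possible outcomes."""
    return Fraction(sum(1 for t in totals if event(t)), len(totals))


p_seven = prob(lambda t: t == 7)      # event A: the total is 7
p_over_8 = prob(lambda t: t > 8)      # event B: the total is greater than 8
expected_in_100 = 100 * p_over_8      # expected frequency of B in 100 throws

# Sample space for one card drawn from an ordinary pack of 52
ranks = ["A"] + [str(v) for v in range(2, 11)] + ["J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))
picture = {card for card in deck if card[0] in {"J", "Q", "K"}}
hearts = {card for card in deck if card[1] == "hearts"}

p_picture = Fraction(len(picture), len(deck))
p_heart = Fraction(len(hearts), len(deck))
p_and = Fraction(len(picture & hearts), len(deck))   # picture card and a heart
p_or = Fraction(len(picture | hearts), len(deck))    # picture card or a heart

print(p_seven, p_over_8, float(expected_in_100))
print(p_and, p_or, p_or == p_picture + p_heart - p_and)
```

Counting the union of the two sets directly and comparing it with P(A) + P(B) − P(A ∩ B) shows why the overlap has to be subtracted once.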

Tree Diagrams

Remember to multiply probabilities along the branches (and) and add probabilities at the ends of branches (or).

Independent events: P(A and B) = P(A ∩ B) = P(A) × P(B)

Example 1: A food manufacturer is giving away toy cars and planes in packets of cereals. The ratio of cars to planes is 9:1 and 25% of toys are red. Joe would like a car that is not red. Construct a tree diagram and use it to calculate the probability that Joe gets what he wants.

Answer:
Event A (the toy is a car): P(A) = 0.9
Event B (the toy is not red): P(B) = 0.75
The probability of Joe getting a car that is not red is 0.9 × 0.75 = 0.675.

Example 2 (dependent events): A pack of cards is shuffled; Liz picks two cards at random without replacement. Find the probability that both of her cards are picture cards.

Answer:
Event A (1st card is a picture card)
Event B (2nd card is a picture card)
The probability of choosing two picture cards is 12/52 × 11/51 = 11/221.

Conditional probability

If A and B are independent events then the probability that event B occurs is not affected by whether or not event A has already happened. This can be seen in example 1 above. For independent events, P(B | A) = P(B).

If A and B are dependent, as in example 2 above, then $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$, so that the probability of Liz picking a picture card on the second draw, given that she has already picked one picture card, is

\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{11/221}{3/13} = \frac{11}{51} \]

The multiplication law for dependent probabilities may be rearranged to give P(A and B) = P(A ∩ B) = P(A) × P(B | A).

Example: A survey in a particular town shows that 35% of the houses are detached, 45% are semi-detached and 20% are terraced. 30% of the detached and semi-detached properties are rented, whilst 45% of the terraced houses are rented. A property is chosen at random.
(i) Find the probability that the property is rented.
(ii) Given that the property is rented, calculate the probability that it is a terraced house.
Answer
Let A be the event (the property is rented).
Let B be the event (the property is terraced).

(i) P(rented) = (0.35 × 0.3) + (0.45 × 0.3) + (0.2 × 0.45) = 0.33
The three terms are, in order, the probability that a house is detached and rented, the probability that a house is semi-detached and rented, and the probability that a house is terraced and rented.

(ii) P(A) = 0.33 from part (i)

\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{0.2 \times 0.45}{0.33} = 0.27 \text{ (2 decimal places)} \]
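The tree-diagram arithmetic for the housing survey and for Liz's two cards can be written out in a few lines. This is a minimal sketch with hypothetical dictionary names; it multiplies along the branches and adds across them, as the rule for tree diagrams describes.

```python
from fractions import Fraction

# First set of branches: type of house
share = {
    "detached": Fraction(35, 100),
    "semi-detached": Fraction(45, 100),
    "terraced": Fraction(20, 100),
}
# Second set of branches: P(rented | type of house)
rented_given_type = {
    "detached": Fraction(30, 100),
    "semi-detached": Fraction(30, 100),
    "terraced": Fraction(45, 100),
}

# (i) Multiply along each branch, then add the three "rented" end points
p_rented = sum(share[t] * rented_given_type[t] for t in share)

# (ii) P(terraced | rented) = P(terraced and rented) / P(rented)
p_terraced_and_rented = share["terraced"] * rented_given_type["terraced"]
p_terraced_given_rented = p_terraced_and_rented / p_rented

# Dependent events: two picture cards drawn without replacement
p_first = Fraction(12, 52)                # P(A): first card is a picture card
p_second_given_first = Fraction(11, 51)   # P(B | A): 11 picture cards left in 51
p_both = p_first * p_second_given_first   # P(A and B) = P(A) x P(B | A)

print(p_rented, p_terraced_given_rented, round(float(p_terraced_given_rented), 2))
print(p_both, p_both / p_first)
```

Dividing `p_both` by `p_first` recovers 11/51, which is the rearranged multiplication law P(B | A) = P(A ∩ B) / P(A).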