2010 Mr Teo www.teachmejcmath-sg.webs.com

Key Concepts:

1) Sketching of scatter diagram

The scatter diagram of bivariate data (i.e. data containing two variables) can be obtained easily using the GC. Students are advised to refer to their lecture notes for the GC operations needed to obtain a scatter diagram.

Note (1): Students are required to recognize and identify scatter patterns that display positive, negative and/or zero correlation. The type of correlation can also be deduced from the value of the product moment correlation coefficient.

2) Product Moment Correlation Coefficient, r

The value of r can be obtained easily using the GC if all the data points are provided in a table. Students are advised to refer to their lecture notes for the GC operations needed to obtain r. For data provided only in summation form, students must use the formula provided in MF15 to evaluate r:

\[
r = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\left[\sum (x-\bar{x})^2\right]\left[\sum (y-\bar{y})^2\right]}}
  = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}
\]

where $\bar{x}$ and $\bar{y}$ are the sample means of x and y respectively, e.g. $\bar{x} = \frac{\sum x}{n}$.

Special Example on r:

Certain questions may not provide the data in tabular or summation form. They may simply provide the two least squares regression equations. For instance,
Least squares regression line of y on x is given as y = 0.2x + 3
Least squares regression line of x on y is given as x = 4y + 5

Here, r can be obtained by using the formula $r^2 = bd$, where b = 0.2 and d = 4. Since b and d are both positive, r is positive. Hence, $r = \sqrt{0.2 \times 4} = 0.894$ for this example.

Note (2): The values of b and d are either both positive or both negative. Can you tell why?

Note (3): The value of r is between -1 and 1 inclusive, i.e. $-1 \le r \le 1$. Students should know how to describe the relationship between the data based on the value of r. For example, if r is 0.9, then the data can be described as having a high positive linear correlation. Also, r measures how closely the data points are spread about a straight line, and it is independent of the units involved in the data. This implies that if the units of the original data collected are changed (e.g. from minutes to seconds), the value of r is not affected.

Note (4): The value of r is unique to the sample data that defines it. If a new set of data is included in the original sample, the value of r will in general be different. The revised r value can be obtained easily using the GC. However, if the new data point takes the values of $\bar{x}$ and $\bar{y}$ of the original sample, the value of r will not change. The proof of this result can be rather tedious, hence students are advised to just remember it as a fact. Similarly, it can be proven that the least squares regression equations are also not affected by the introduction of this new data point (i.e. $(\bar{x}, \bar{y})$).
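The two routes to r described above (the summation formula and the product of the two regression slopes) can be sketched in a few lines of Python. This is only an illustration, not a substitute for the GC steps: the function names and the minutes/score data are hypothetical, and only the slopes 0.2 and 4 come from the example in these notes.

```python
import math

def r_from_sums(n, sx, sy, sxx, syy, sxy):
    """r from summary sums, using the summation form of the formula for r."""
    numerator = sxy - sx * sy / n
    denominator = math.sqrt((sxx - sx ** 2 / n) * (syy - sy ** 2 / n))
    return numerator / denominator

def r_from_slopes(b, d):
    """r from the two regression slopes: r^2 = b*d, and r takes the sign of b."""
    if b * d < 0:
        raise ValueError("b and d must have the same sign")
    return math.copysign(math.sqrt(b * d), b)

def summary_sums(xs, ys):
    """Return n, sum x, sum y, sum x^2, sum y^2, sum xy for paired data."""
    return (len(xs), sum(xs), sum(ys),
            sum(x * x for x in xs), sum(y * y for y in ys),
            sum(x * y for x, y in zip(xs, ys)))

# Example from the notes: y = 0.2x + 3 and x = 4y + 5.
r_example = r_from_slopes(0.2, 4)

# Hypothetical data (not from the notes): time in minutes against a score.
minutes = [1.0, 2.0, 3.5, 4.0, 6.0]
score = [2.1, 2.9, 4.2, 4.1, 6.3]
r_minutes = r_from_sums(*summary_sums(minutes, score))

# Changing the units of x (minutes to seconds) should leave r unchanged.
seconds = [60 * m for m in minutes]
r_seconds = r_from_sums(*summary_sums(seconds, score))

print(round(r_example, 3), round(r_minutes, 4), round(r_seconds, 4))
```

Taking the sign of r from b reflects Note (2): both slopes share the sign of $\sum (x-\bar{x})(y-\bar{y})$, and so does r.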
To verify that everything in Note (4) is true, students are welcome to try this simple exercise. First, using the data below, determine the values of $\bar{x}$, $\bar{y}$, r, and the equations of y on x and x on y using your GC.

x    10     12     14     17     16     20     …
y    2.3    2.8    3.1    3.2    2.9    5.0    4.0

Next, using $(\bar{x}, \bar{y})$ as the 8th data point, find the revised values of r and the equations of y on x and x on y. I believe you will get the same set of values as with the original set of seven data points; otherwise, please drop me an email!

3) Equations of Least Squares Regression Lines (LSRL)

y on x:  y = a + bx
x on y:  x = c + dy

The steps required to produce these regression lines using the GC should be well documented in your lecture notes; please refer to them if required. As discussed earlier, the product moment correlation coefficient can be obtained by using $r^2 = bd$.

Note (5): The LSRLs can be plotted onto the scatter diagram. Students are advised to refer to their lecture notes for the GC operations needed to do this.
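For students who want to check Note (4) away from the GC, the exercise can be mimicked with a short Python sketch. The seven data points below are hypothetical (they are not the data from the exercise), and `fit` is an illustrative helper name, not a GC or library function.

```python
import math

def fit(xs, ys):
    """Return the sample means, r, and the coefficients of both regression lines."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx  # slope of y on x
    d = sxy / syy  # slope of x on y
    return {
        "xbar": xbar, "ybar": ybar,
        "r": sxy / math.sqrt(sxx * syy),
        "a": ybar - b * xbar, "b": b,  # y = a + bx
        "c": xbar - d * ybar, "d": d,  # x = c + dy
    }

# Hypothetical sample of seven points.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8, 7.1]

before = fit(xs, ys)
# Add the mean point (xbar, ybar) as an 8th data point and refit.
after = fit(xs + [before["xbar"]], ys + [before["ybar"]])

for key in ("xbar", "ybar", "r", "a", "b", "c", "d"):
    print(f"{key}: {before[key]:.6f} -> {after[key]:.6f}")
```

The values agree because the added point leaves both means unchanged and contributes zero to $\sum (x-\bar{x})^2$, $\sum (y-\bar{y})^2$ and $\sum (x-\bar{x})(y-\bar{y})$.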
Note (6):

i) The constants a, b, c and d can be obtained using the GC if the data is provided in tabular form. For data provided in summation form, students must apply the formula available in MF15:

\[
b = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2}
\]

Another formula, which is not provided in MF15 but is worth remembering, is:

\[
b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}
\]

The constant a can be obtained by substituting the values $\bar{x}$ and $\bar{y}$ into the least squares regression equation, i.e. $a = \bar{y} - b\bar{x}$.

ii) The two regression lines y on x and x on y intersect at the point $(\bar{x}, \bar{y})$, i.e. the sample means of x and y. This point is therefore very useful if either of the equations contains unknowns. It has another useful application, which is covered in Note (4).

iii) The two least squares regression equations are used to predict the value of a particular unknown as stated in the question. For example, if you are predicting the value of y (i.e. you are given the value of x), you must use the least squares regression line of y on x. On the other hand, for predicting the x value, you should use x on y.

iv) The value being predicted is valid (or reliable) only if the corresponding known value is within the range of the data (i.e. interpolation). For example, imagine you are predicting the y value from data with x values ranging between 1 and 10. If the corresponding x value is 7 (i.e. within
the range), the y value predicted using the regression equation of y on x is valid. If the x value used is 11 instead (i.e. outside the range, or extrapolation), the predicted value of y will not be valid.

Note (7): For a correlation coefficient r very close to 1 or -1, it does not matter which regression equation you use to predict a particular variable. However, students are advised to use the steps discussed in Note (6) (iv) as the main approach. Of course, for Note (6) (iv) to work, the data must at least display a positive or negative linear correlation.

4) Non-Linear Data (For H2 students only)

Certain data produce a scatter diagram that shows no linear relationship, or only a weak one, between the variables (e.g. r is close to 0 or very far from ±1), but the scatter diagram follows a pattern that resembles a curve, e.g. a U- or ∩-shaped quadratic curve. Students should recognize the shape of the curve and select an appropriate equation that can be used to describe the data. For example, if the data looks like the plot on the left below, which resembles a sketch of y = ln x (diagram on the right), then students should consider the expression y = ln x (i.e. a model of the form y = a + b ln x) to obtain a better correlation between the variables.
[Figure: scatter plot of the data (left) and sketch of y = ln x (right)]

The more commonly used equations are logarithmic, exponential and quadratic equations.

Note (8): Students should recall the chapter on Linear Law in Additional Mathematics, as that forms the basis of this section.

Other Comments:

This is a fairly straightforward chapter which accounts for at least 8 marks, and most of the solutions can be obtained using the GC. As such, I advise all students to remember every GC operation covered in this chapter. For H2 students, you may benefit from remembering the points covered under Notes (4) and (7), since more is expected from this group of students (i.e. a deeper understanding, or what I would like to call an appreciation, of the chapter).
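To illustrate the idea in section 4, the sketch below compares r for (x, y) with r for (ln x, y) on data that follows a logarithmic curve. The data is hypothetical: it is generated from y = 1 + 2 ln x with small made-up deviations, and `pmcc` is an illustrative helper name.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data lying close to y = 1 + 2 ln x.
xs = [1, 2, 4, 8, 16, 32, 64]
deviations = [0.05, -0.04, 0.03, -0.02, 0.04, -0.05, 0.02]
ys = [1 + 2 * math.log(x) + e for x, e in zip(xs, deviations)]

r_raw = pmcc(xs, ys)                        # y against x
r_ln = pmcc([math.log(x) for x in xs], ys)  # y against ln x

print(f"r for (x, y):    {r_raw:.4f}")
print(f"r for (ln x, y): {r_ln:.4f}")
```

With this data, r for (ln x, y) is much closer to 1 than r for (x, y), which is the sense in which the transformation gives "a better correlation between the variables".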
Final Checklist

By the end of the topic, students must be able to perform the following to secure maximum marks in the exam:

- Sketch a scatter diagram and use it to comment on the relationship between the variables (i.e. positive/negative linear correlation).
- When required, identify suspected data point(s) to be excluded from the data set.
- Evaluate the value of the product moment correlation coefficient, r, using the GC or the formula.
- Comment on the data based on the value of r (e.g. positive linear correlation).
- Find the least squares regression lines of y on x and/or x on y.
- Appreciate that the intersection of the two regression lines is $(\bar{x}, \bar{y})$.
- Use the least squares regression lines to determine/approximate values (i.e. find y given x).
- Identify cases of extrapolation and/or interpolation where necessary.
- Apply a square, reciprocal or logarithmic transformation to achieve linearity for non-linear relationships (i.e. r is very far from ±1) between variables (H2 students only).
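The checklist items on prediction (choosing the correct regression line and identifying interpolation versus extrapolation) can be combined in one small Python sketch. The data and the helper names `regress` and `predict` are hypothetical and serve only as an illustration.

```python
def regress(dep, indep):
    """Least squares line of dep on indep; returns (intercept, slope)."""
    n = len(dep)
    mean_dep, mean_ind = sum(dep) / n, sum(indep) / n
    s_cross = sum((i - mean_ind) * (d - mean_dep) for i, d in zip(indep, dep))
    s_ind = sum((i - mean_ind) ** 2 for i in indep)
    slope = s_cross / s_ind
    return mean_dep - slope * mean_ind, slope

def predict(known, target, known_value):
    """Predict the target variable from a known value.

    Uses the regression line of target on known, and also reports
    whether known_value lies within the data range (interpolation).
    """
    intercept, slope = regress(target, known)
    is_interpolation = min(known) <= known_value <= max(known)
    return intercept + slope * known_value, is_interpolation

# Hypothetical data with x between 1 and 10.
xs = [1, 3, 4, 6, 8, 10]
ys = [3.1, 6.8, 9.2, 13.1, 16.9, 21.0]

y_at_7, ok_7 = predict(xs, ys, 7)        # y on x, x = 7 is inside the range
y_at_11, ok_11 = predict(xs, ys, 11)     # y on x, x = 11 is outside the range
x_at_12, ok_x = predict(ys, xs, 12.0)    # x on y, used when predicting x

print(f"y at x = 7:  {y_at_7:.2f}, interpolation: {ok_7}")
print(f"y at x = 11: {y_at_11:.2f}, interpolation: {ok_11}")
print(f"x at y = 12: {x_at_12:.2f}, interpolation: {ok_x}")
```

Swapping the argument order in `predict` is what switches between the y on x and x on y lines, mirroring Note (6) (iii).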