Dealing with Data and Fitting Empirically

Size: px
Start display at page:

Download "Dealing with Data and Fitting Empirically"

Transcription

1 Fittig Fuctios to Data: Regressio Dealig with Data ad Fittig Empirically Notes by Holly Hirst Whe workig with a situatio for which we do t have a physical law (ad hece a equatio, fuctio or iequality), we observe the system, recordig data from our observatios. We are the left with the task of makig predictios ad drawig coclusios from the data. To do this we start by determiig the tred of the data, ad the fittig a appropriate fuctio. If the data from the observatios are ordered pairs, we ca look at a graph of the data to determie the tred, ad hece the form of the fuctio to fit. Commo practice puts the variable we are selectig to measure (iput) o the x axis statisticias call it the predictor ad the variable we are hopig to predict (output) o the y axis the respose. I the graph below o the left, the tred appears to be roughly liear with a dowward slope. We ca fit the lie by eye as i the graph o the right, simply drawig the lie so that it looks like o oe data poit is too far from the lie. If a equatio is eeded, we could choose two poits o the lie ad use simple algebra to fid the equatio. While fittig by eye is appealig because it is so simple, a slightly more mathematical approach yields a lie for which we ca quatify the fact that o poit is too far from the lie. Let the formula for the lie we are lookig for be y = Ax+B, where A is the ukow slope ad B is the ukow y- itercept. Look at the graph i the picture below. The actual data poit value for x 1 is y 1 ad the respose value for the predictor x 1 from the lie is Ax 1 +B. We ll call the differece betwee these values the deviatio. Oe way to fit a lie to the data is to miimize the sum of these deviatios, miimize " Ax i + B! y i, Data Fittig Notes 1

2 where we have assumed that there are +1 data poits umbered through. Also ote that we have used the absolute values of the deviatios. We do this so that deviatios for poits above the lie do t cacel out deviatios for poits below the lie. So what ext? To miimize this expressio, we eed to take the partial derivatives with respect to A ad B (the ukows), set the derivatives equal to zero, ad the solve the resultig equatios simultaeously. Ufortuately, the absolute values are difficult to deal with whe takig derivatives, sice absolute values have discotiuous derivatives. Note, however, that a liear programmig approach ca be take to get a Chebychev lie of best fit. We will ot pursue that here. How do we fix this? Istead of miimizig the sum of the deviatios, we will miimize the sum of the squares of the deviatios: miimize "( Ax i + B! y i ) The squares of the deviatios are always o-egative which is why we used the absolute values but avoid the problems with derivatives. I additio, miimizer for this sum is equal to the miimizer for the square root of this sum ad the square root of the sum is closely related to Euclidea distace. To fiish this, we eed to take the two partials ad set them equal to zero:! ( Ax i + B " y i ) # = " ( Ax i + B! y i )( x i )="( Ax i + Bx i! x i y i ) =!A (1)! ( Ax i + B " y i ) # = " ( Ax i + B! y i )( 1)=" ( Ax i + B! y i ) =!B () Yuk! Now what? We eed to simplify these two equatios ad the solve them for A ad B. To do this we eed to use the fact that we ca rearrage sums of sums as follows:!( T i + R i + S i ) = T i +!! R i +! S i The left side of this equatio idicates to add the i th T, R ad S values first ad the add those sums; the right side says to add the T s, add the R s, add the S s ad the sum those sums. Either way, we have added all the T s, R s ad S s together. Usig this fact, we ca simplify (1) to get: Ax i! +! Bx i "! x i y i = A! x i + B! x i "! x i y i = (3) Where did the factor of go? We divided both sides of the equatio by. Similarly, () simplifies to:! Ax i +! B "! y i = A! x i + B "! y i = (4) We are fially ready to solve these two equatios for A ad B. We will rewrite these without the subscripts ad limits for simplicity as: A! x + B! x =! xy ad A! x + B =! y Data Fittig Notes

3 Multiplyig the first oe by ad multiplyig secod oe by Solvig for A gives: ( ) = xy A! x "! x! x! x ad the subtractig gives:! "! x! y (5) A =! x i y i "! x i! y i #! x i " % &! x $ i ' i= Pluggig this expressio for A ito (4) gives a formula for B: B =! y " A i! x i i =1 i =1 Here is a example doe usig these formulas. We have used a table to orgaize the calculatios of all of the sums. (FYI: These data are from a physics experimet o sprigs.) X Y X^ XY sums: A = ( 6*94.5 5*59.3) / (6*145 5*5) = B = ( *5) / 6 = So the lie we might use to predict y usig a x value is: y = 1.61 x Goodess of Fit So how ca we tell the quality of the fit? First we eed to look at the data ad a graph showig the lie ad the data, askig whether the coditios eeded for liear regressio hold. If we are satisfied with the fit, the we ca look at a statistical measure of the stregth of the fit called the coefficiet of determiatio. As usual, let the actual data poits be (x i, y i ), i = 1,...,. Also let the predicted value for x i from the fitted lie be y ˆ i. So we have y ˆ i = Ax i + B. Coditios for Regressio The assumptios we eed for regressio iformatio to make sese (be ubiased estimates of coefficiets ad give reliable estimates of variace) are: 1. The x values are fixed, ot radom. Data Fittig Notes 3

4 . The y data values are idepedet of each other ad ormally distributed. This is usually the case if the data were collected carefully. Oe otable exceptio is values that are serially collected for which the idepedet variable is ot time. These time series data may deped o time ad hece o each other. 3. The idividual residuals ( y ˆ i y i ) are idepedet of each other ad ormally distributed with mea. How do we recogize whe these assumptios are violated? 1. The sigle best way to tell if the regressio is good is: LOOK AT THE GRAPH! Look for the followig thigs: Does the lie go through the middle of the data? o = bad (might be iflueced by a outlier) Is there a patter to the deviatios? yes = bad (might be o-liear) Is the shape of the data rectagular (rather tha wedge shaped)? o = bad (might eed a trasformatio of the y values or a weighted regressio).. Oe ca also look at some other graphs: Plot the y values o a histogram to look for ormality Plot the residuals ( y ˆ i y i ) to look for ormality with mea =. Plot residuals agaist fitted values (Y = ( y ˆ i y i ) versus X = y ˆ i ). This should be a horizotal bad across the graph. If there s a (sloped) liear patter there might be a uderlyig idepedet variable that should replace the x that was used. If there is a wedge shape, a trasformatio of y or weighted regressio might be eeded. Coefficiet of Determiatio Oce we are sure that oe of the coditios above are violated, we ca look at a value called the coefficiet of determiatio, aka R-Squared. This will give us a statistical measure of the quality of the fit. What did we do to fid the fits? We solved the problem or, usig the hat otatio: miimize miimize " ( ) " Ax i + b! y i ( ) ˆ y i! y i If we wat to fid out how well the lie captures the relatioship i the data, we ca examie the followig quatities that measure variace: Data Fittig Notes 4

5 SSTO (total sum of squares): " ( y i! y ), which measures the overall deviatio from the mea ( y ) of the respose variables without regard to the predictor variable. If this were, the there would be o eed to kow ay x values, i.e., o depedece o the x values. SSE (error sum of squares): " ( y i! y ˆ ) i, which measures the deviatio of the lie values from the observed values for the respose, i.e., takig ito cosideratio the predictor variable. If this were, the we would kow that the relatioship was exactly liear, i.e., all of the data poits are o the regressio lie. SSR (regressio sum of squares): " ( y ˆ i! y ), which measures the overall deviatio from the mea ( y ) of the fitted variables. If this were, the there would be o eed to kow ay x values, i.e., o depedece o the x values ad the regressio lie would be horizotal. Usig basic ideas from liear algebra, we ca show that " ( y i! y ) = " ( y i! y ˆ ) i + " ( y ˆ i! y ), (1) i.e., SSTO = SSE + SSR. This equality is the fudametal reaso for choosig to miimize the sum of the squared deviatios. I words, this equality says the followig: The overall variatio i the data is equal to the sum of the overall deviatio of the data from the fitted lie plus the deviatio of the lie values from the mea. The coefficiet of determiatio is defied as the proportio of the error ot attributable to the lie SSTO! SSE (SSTO-SSE) out of the total error:, ad thus calculated as SSTO " ( y! y ˆ i i ) 1! " ( y i!y ) or, i words, this is the proportio of the total variatio that ca be accouted for by the lie fit. It ca also be calculated as: " ( y ˆ i! y ) " ( y i! y ) That these two values are the same ca be show easily from the fudametal relatioship (1). Also from (1) we kow that this umber is always betwee ad 1, ad the closer to 1 it is, the Data Fittig Notes 5

6 more the lie "explais" the overall variatio i the data. I fact, the coefficiet of determiatio ca be thought of as the %-goodess of fit of the lie. We ca calculate this by had, but luckily for us, may software packages ca calculate this automatically whe the tredlie calculatio is doe y = 1.161x R = Series1 Liear (Series1) Notice that the lie is a pretty good fit (.9919 is very close to 1). We eed to use cautio whe lookig at the coefficiet of determiatio aka R-Squared. Whe fittig a lie, we ca iterpret the umber as a percet goodess of fit. So for the lie fitted above, 99% of the variatio i the data is explaied by the liear relatioship y = 1.16 x 5.4. What should t we say? Noe of the followig statemets are true! The lie will predict the y value 99% of the time. The y value predicted by the lie will be 99% of the actual value. No-Liear Fits Cosider the example i Chapter 5 of the Edward s ad Hamso s Guide to Modellig data collected to see if there is a relatioship betwee height i meters ad weight i kilograms. (Table 6.1 o page 13) x(height) y(weight) Data Fittig Notes 6

7 These data, whe plotted, give the followig graph: Height versus Weight weight (kg) height (m) These data seem a little curved, but let s fit a lie to it ad take a look: Height versus Weight y = x weight (kg) height (m) The liear fit is t bad, but otice that there is a patter to the deviatios below 1 meter ad above 1.75 meters, the data poits are above the lie; i betwee 1 ad 1.75 meters, the data poits lie below the lie. As metioed i the previous sectio, patters to the deviatios are idicative of a o-liear fit. So what o-liear fuctio should we try to fit? If we take the approach from MAT 595, we would thik about what the data represet: Height versus weight i humas. Let s make some assumptios: Humas are geometrically similar Weight is proportioal to volume Volume is a 3-D measuremet whereas height is 1-D. From these assumptios, it seems reasoable to try weight! height 3. So let s repeat the steps we used to fid the formula for liear regressio to fid a formula for simple cubic regressio. We start with miimizig the sum of the squared deviatios: miimize " ( Ax 3 i! y i ) Next we take the derivative with respect to the variable (A) ad set the result equal to zero. d da "( Ax 3 i! y i ) = Ax 3 3 " ( i! y i ) x i ( ) = A x i 6 i= Data Fittig Notes 7 "! " x 3 i y i = (1)

8 Solvig this for A gives: Orgaizig the calculatios i table form: A =!! i= x i 3 y i x i 6 Buildig a ew table with x, y ad A*x^3 gives: x(height) y(weight) x^6 x^3*y sum: A= Plottig this: x(height) y(weight) A*x^ Data Fittig Notes 8

9 data with cubic fit w e i g h t h e i g h t y(weight) A*x^3 Note from the graph that this fit appears to be bad at the bottom. Why might our assumptios have let us dow? The lower values i our data set come from people uder 1 meter tall amely childre. Are childre geometrically similar to adults? Not really. Let s try some other models. I the last attempt we chose the power o the height to be 3. What if we let that be ukow as well lettig the data drive the power empirically? miimize " ( Ax B i! y i ) Next we take the derivative with respect to the variables (A ad B) ad set the results equal to zero. Here is the partial derivative with respect to A.! #( Ax B!A i " y i ) = Ax B B " ( i! y i ) x i ( ) = A x i B Data Fittig Notes 9 "! " x B i y i = (1) Notice that the resultig equatio is ot goig to be liear (B is i the expoet!), leadig to a system that is ot easy to hadle. Is there aother way? Yes! We ca use a log trick to get the B out of the expoet: y = Ax B! l( y) = l( Ax B )! l( y) = l( A) + l( x B )! l( y) = l( A) + Bl( x) This says that Y = l(y) ad X = l(x) have a liear relatioship wheever y ad x have a power relatioship, with the slope equal to the power, B, ad the y-itercept equal to l(a) -- the log of the costat. So what should we do? Fit a lie betwee l(x) ad l(y)... x(height) y(weight) l(x) l(y)

10 Here is a lie fit to l(x) ad l(y): log-log plot l ( w e i g h t ) 5 y =.347x l ( h e i g h t ) This is ofte referred to as log-log regressio. This looks liear, ad the deviatios are more radomly placed. How do we recover the origial fuctio from this? We'll use the statemet we made two paragraphs up: Y = l(y) ad X = l(x) have a liear relatioship wheever y ad x have a power relatioship, with the slope equal to power, B, ad the y-itercept equal to l(a) -- the log of the costat. So.347 = B = power ad.86 = l(a) or exp(.86) = = costat, givig Here is the graph usig the power curve: y = 16.79x power fit usig excel y = x.347 weight (y) height (x) y(weight) Power (y(weight)) Oe last thig to try o this height-weight data set is expoetial regressio. A*exp(Bx) is aother fuctio that has a similar shape for x >. If we tried agai with the aive approach to regressio miimizig the sum of the squared deviatios, we would ru ito algebraic problems agai, just like with the power curve calculatios above. We ca trasform this i a similar way: Data Fittig Notes 1

11 y = Ae Bx! l( y) = l( Ae Bx )! l( y) = l( A) + l( e Bx )! l y ( ) = l( A) + Bxl( e) ( ) = l( A) + Bx! l y This says that Y = l(y) ad X = x have a liear relatioship wheever y ad x have a expoetial relatioship, with the slope equal to the power, B, ad the y-itercept equal to l(a) -- the log of the costat. So what should we do? Fit a lie betwee x ad l(y)... Here is a graph of the liearized data: x(height) y(weight) x(height) l(y) l(y) -- l(weight) log plot y = x l(y).5 Liear (l(y)) x -- height This is ofte referred to as log regressio. This looks liear, ad as i the power fit the deviatios are more radomly placed. How do we recover the origial fuctio from this? We'll use the statemet we made two paragraphs up: Y = l(y) ad X = x have a liear relatioship wheever y ad x have a expoetial relatioship, with the slope equal to power, B, ad the y-itercept equal to l(a) -- the log of the costat. So = B = power ad.913 = l(a) or exp(.913) =.513 = costat, givig y =.513e 1.856x Data Fittig Notes 11

12 Here is the graph usig the power curve ad the expoetial curve ad showig the fuctios together: height versus weight weight -- y height -- x y(weight) Power (y(weight)) Expo. (y(weight)) y =.515e x y = x.347 Checkig for Goodess of Fit i the No-Liear Case Sice these fits are based upo a liearizatio of the origial data, the same rules apply whe decidig if the fit is good. Look at the trasformed data ad check: 1. The x values are fixed, ot radom.. The y data values are idepedet of each other ad ormally distributed. This is usually the case if the data were collected carefully. Oe otable exceptio is values that are serially collected for which the idepedet variable is ot time. These time series data may deped o time ad hece o each other. 3. The residuals are idepedet of each other ad ormally distributed with mea. How do we recogize whe these assumptios are violated? The sigle best way to tell if the regressio is good is LOOK AT THE GRAPH! Look for the followig thigs both with the origial ad with the trasformed data: Does the curve go through the middle of the data? o = bad Is there a patter to the deviatios? yes = bad Is the shape of the data rectagular (rather tha wedge shaped)? I the case of the origial data, look for a rectagle that is bet to follow the curve. o = bad Coefficiet of Determiatio as used with No-Liear Fits Note that i each case we could get a R-squared value. Whe fittig o-liear curves to the data, i particular the power ad expoetial fits we have see already, cautio must be used whe lookig at this umber. It tells us how good the fit is for the trasformed data NOT the origial data. I particular, logarithmic trasformatios chage the distaces betwee the y data values ad the mea ( y ) i a o-uiform way. This implies that the traditioal explaatio for the meaig of the coefficiet of determiatio does t work if the data have bee trasformed i this way. We ca still look at the coefficiet of determiatio i a comparative way to choose betwee two curves of the same kid, such as two differet expoetials, two power fits, etc., but ot as a way to choose betwee the fits from differet kids of curves. This mis-use is commo i applicatios of statistics to the social scieces remember to avoid this!!! Data Fittig Notes 1

13 So, bottom lie, what is the best way to tell if we have chose the best curve? Use your eyes! Look at the data with the various fits ad choose the oe that appears to come closest to all of the poits ad gives a radom patter i the deviatios. Problems 1. The followig data table cotais a sample of 4 idividuals out of over 7 who participated i a study i Hoolulu, HI i Aswer each questio, ad explai your coclusios. Is there a lik betwee weight ad cholesterol levels? Please ote: Lower total cholesterol is cosidered healthier. Does age matter? Does physical activity matter? Does smokig affect blood pressure? Some experts propose that comparig weight aloe to blood pressure is ot as good as comparig weight per cm of height to blood pressure, i.e., create a ew variable that is weight divided by height for each perso. Ca this ew variable be used to predict blood pressure? Educatio (Highest completed) Hoolulu Study Data (197) Systolic Blood Pressure Weight (kg) Height (cm) Age Smoker Physical Activity Cholesterol primary y moderate oe heavy oe y moderate 7 19 primary y moderate primary heavy high school moderate oe heavy itermediate y moderate college heavy primary moderate high school y heavy oe moderate 9 15 oe heavy primary moderate itermediate heavy high school heavy itermediate y moderate college heavy oe moderate high school moderate oe y moderate itermediate y moderate primary y moderate primary moderate itermediate y heavy itermediate moderate primary y heavy 11 1 high school moderate 1 98 primary moderate itermediate moderate high school heavy oe y heavy primary moderate oe heavy primary moderate itermediate heavy primary heavy college moderate oe y heavy oe y moderate Wild black bears were aesthetized, ad their bodies were measured ad weighed. Oe goal of the study was to look at whether the distributio of the weights is differet for male ad female bears. Data Fittig Notes 13

14 The secod goal of the study was to determie the kid of relatioship that explaied best the associatio betwee legth ad weight. The third goal of the study was to fid a liear equatio for weight i terms of some other measurable characteristic for forest ragers, so they could estimate the weight of a bear based o that oe measuremet. This would be useful because i the field it is easier to measure a legth tha it is to weigh a bear with a scale. Use the dataset to ivestigate these goals. lik to data: 3. I order to avoid the expese ad icoveiece of usig a water tak to determie body fat, we wish to fid a measuremet that ca predict body fat. The data set provided below cosists of estimates of body fat determied by uderwater weighig alog with various body circumferece measuremets for 5 me. Which of these measuremets is the best predictor of body fat? lik to data: 4. Is there a relatioship betwee brai ad body weight i mammals? Use the data provided for 53 species average adult male body weight to ivestigate this questio. lik to data: 5. World Rakigs: Data Source: This problem is desiged to let you grub aroud with real data. Real data are ot always ice, so do t be surprised if you ru ito less tha satisfyig results. Your assigmet is to try various fits ad use the ideas from class to choose the best relatioship or to say why you thik there is o relatioship. Some examples of relatioships you might ivestigate: Explore the relatioship betwee life expectacy ad GDP. Explore the relatioship betwee eergy use ad GDP. Explore the relatioship betwee literacy ad ifat mortality rate. Explore the relatioship betwee CO emissio ad GDP. Explore the relatioship betwee populatio ad eergy use. Explore the relatioship betwee literacy ad life expectacy. Explore the relatioship betwee CO emissios ad eergy use. Explore the relatioship betwee fertility rate ad % over 6. Data Fittig Notes 14

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable Statistics Chapter 4 Correlatio ad Regressio If we have two (or more) variables we are usually iterested i the relatioship betwee the variables. Associatio betwee Variables Two variables are associated

More information

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation

II. Descriptive Statistics D. Linear Correlation and Regression. 1. Linear Correlation II. Descriptive Statistics D. Liear Correlatio ad Regressio I this sectio Liear Correlatio Cause ad Effect Liear Regressio 1. Liear Correlatio Quatifyig Liear Correlatio The Pearso product-momet correlatio

More information

Linear Regression Models

Linear Regression Models Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Correlation Regression

Correlation Regression Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother

More information

Statistical Properties of OLS estimators

Statistical Properties of OLS estimators 1 Statistical Properties of OLS estimators Liear Model: Y i = β 0 + β 1 X i + u i OLS estimators: β 0 = Y β 1X β 1 = Best Liear Ubiased Estimator (BLUE) Liear Estimator: β 0 ad β 1 are liear fuctio of

More information

Simple Linear Regression

Simple Linear Regression Chapter 2 Simple Liear Regressio 2.1 Simple liear model The simple liear regressio model shows how oe kow depedet variable is determied by a sigle explaatory variable (regressor). Is is writte as: Y i

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

Regression, Part I. A) Correlation describes the relationship between two variables, where neither is independent or a predictor.

Regression, Part I. A) Correlation describes the relationship between two variables, where neither is independent or a predictor. Regressio, Part I I. Differece from correlatio. II. Basic idea: A) Correlatio describes the relatioship betwee two variables, where either is idepedet or a predictor. - I correlatio, it would be irrelevat

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

University of California, Los Angeles Department of Statistics. Simple regression analysis

University of California, Los Angeles Department of Statistics. Simple regression analysis Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100C Istructor: Nicolas Christou Simple regressio aalysis Itroductio: Regressio aalysis is a statistical method aimig at discoverig

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

SNAP Centre Workshop. Basic Algebraic Manipulation

SNAP Centre Workshop. Basic Algebraic Manipulation SNAP Cetre Workshop Basic Algebraic Maipulatio 8 Simplifyig Algebraic Expressios Whe a expressio is writte i the most compact maer possible, it is cosidered to be simplified. Not Simplified: x(x + 4x)

More information

Regression and Correlation

Regression and Correlation 43 Cotets Regressio ad Correlatio 43.1 Regressio 43. Correlatio 17 Learig outcomes You will lear how to explore relatioships betwee variables ad how to measure the stregth of such relatioships. You should

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N. 3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

Simple Regression. Acknowledgement. These slides are based on presentations created and copyrighted by Prof. Daniel Menasce (GMU) CS 700

Simple Regression. Acknowledgement. These slides are based on presentations created and copyrighted by Prof. Daniel Menasce (GMU) CS 700 Simple Regressio CS 7 Ackowledgemet These slides are based o presetatios created ad copyrighted by Prof. Daiel Measce (GMU) Basics Purpose of regressio aalysis: predict the value of a depedet or respose

More information

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to: STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio

More information

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics

ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER / Statistics ACCESS TO SCIENCE, ENGINEERING AND AGRICULTURE: MATHEMATICS 1 MATH00030 SEMESTER 1 018/019 DR. ANTHONY BROWN 8. Statistics 8.1. Measures of Cetre: Mea, Media ad Mode. If we have a series of umbers the

More information

10-701/ Machine Learning Mid-term Exam Solution

10-701/ Machine Learning Mid-term Exam Solution 0-70/5-78 Machie Learig Mid-term Exam Solutio Your Name: Your Adrew ID: True or False (Give oe setece explaatio) (20%). (F) For a cotiuous radom variable x ad its probability distributio fuctio p(x), it

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

September 2012 C1 Note. C1 Notes (Edexcel) Copyright - For AS, A2 notes and IGCSE / GCSE worksheets 1

September 2012 C1 Note. C1 Notes (Edexcel) Copyright   - For AS, A2 notes and IGCSE / GCSE worksheets 1 September 0 s (Edecel) Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright www.pgmaths.co.uk - For AS, A otes ad IGCSE / GCSE worksheets September 0 Copyright

More information

Power and Type II Error

Power and Type II Error Statistical Methods I (EXST 7005) Page 57 Power ad Type II Error Sice we do't actually kow the value of the true mea (or we would't be hypothesizig somethig else), we caot kow i practice the type II error

More information

a is some real number (called the coefficient) other

a is some real number (called the coefficient) other Precalculus Notes for Sectio.1 Liear/Quadratic Fuctios ad Modelig http://www.schooltube.com/video/77e0a939a3344194bb4f Defiitios: A moomial is a term of the form tha zero ad is a oegative iteger. a where

More information

Polynomial Functions and Their Graphs

Polynomial Functions and Their Graphs Polyomial Fuctios ad Their Graphs I this sectio we begi the study of fuctios defied by polyomial expressios. Polyomial ad ratioal fuctios are the most commo fuctios used to model data, ad are used extesively

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

PH 425 Quantum Measurement and Spin Winter SPINS Lab 1

PH 425 Quantum Measurement and Spin Winter SPINS Lab 1 PH 425 Quatum Measuremet ad Spi Witer 23 SPIS Lab Measure the spi projectio S z alog the z-axis This is the experimet that is ready to go whe you start the program, as show below Each atom is measured

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

Regression and correlation

Regression and correlation Cotets 43 Regressio ad correlatio 1. Regressio. Correlatio Learig outcomes You will lear how to explore relatioships betwee variables ad how to measure the stregth of such relatioships. You should ote

More information

Section 14. Simple linear regression.

Section 14. Simple linear regression. Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

Read through these prior to coming to the test and follow them when you take your test.

Read through these prior to coming to the test and follow them when you take your test. Math 143 Sprig 2012 Test 2 Iformatio 1 Test 2 will be give i class o Thursday April 5. Material Covered The test is cummulative, but will emphasize the recet material (Chapters 6 8, 10 11, ad Sectios 12.1

More information

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.

Big Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates. 5. Data, Estimates, ad Models: quatifyig the accuracy of estimates. 5. Estimatig a Normal Mea 5.2 The Distributio of the Normal Sample Mea 5.3 Normal data, cofidece iterval for, kow 5.4 Normal data, cofidece

More information

Mathematical Notation Math Introduction to Applied Statistics

Mathematical Notation Math Introduction to Applied Statistics Mathematical Notatio Math 113 - Itroductio to Applied Statistics Name : Use Word or WordPerfect to recreate the followig documets. Each article is worth 10 poits ad ca be prited ad give to the istructor

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1. Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio

More information

REGRESSION (Physics 1210 Notes, Partial Modified Appendix A)

REGRESSION (Physics 1210 Notes, Partial Modified Appendix A) REGRESSION (Physics 0 Notes, Partial Modified Appedix A) HOW TO PERFORM A LINEAR REGRESSION Cosider the followig data poits ad their graph (Table I ad Figure ): X Y 0 3 5 3 7 4 9 5 Table : Example Data

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? ABSTRACT

WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? ABSTRACT WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? Harold G. Loomis Hoolulu, HI ABSTRACT Most coastal locatios have few if ay records of tsuami wave heights obtaied over various time periods. Still

More information

Number of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day

Number of fatalities X Sunday 4 Monday 6 Tuesday 2 Wednesday 0 Thursday 3 Friday 5 Saturday 8 Total 28. Day LECTURE # 8 Mea Deviatio, Stadard Deviatio ad Variace & Coefficiet of variatio Mea Deviatio Stadard Deviatio ad Variace Coefficiet of variatio First, we will discuss it for the case of raw data, ad the

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

STP 226 EXAMPLE EXAM #1

STP 226 EXAMPLE EXAM #1 STP 226 EXAMPLE EXAM #1 Istructor: Hoor Statemet: I have either give or received iformatio regardig this exam, ad I will ot do so util all exams have bee graded ad retured. PRINTED NAME: Siged Date: DIRECTIONS:

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 CS 70 Discrete Mathematics for CS Sprig 2007 Luca Trevisa Lecture 22 Aother Importat Distributio The Geometric Distributio Questio: A biased coi with Heads probability p is tossed repeatedly util the first

More information

Stat 139 Homework 7 Solutions, Fall 2015

Stat 139 Homework 7 Solutions, Fall 2015 Stat 139 Homework 7 Solutios, Fall 2015 Problem 1. I class we leared that the classical simple liear regressio model assumes the followig distributio of resposes: Y i = β 0 + β 1 X i + ɛ i, i = 1,...,,

More information

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population

A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population A quick activity - Cetral Limit Theorem ad Proportios Lecture 21: Testig Proportios Statistics 10 Coli Rudel Flip a coi 30 times this is goig to get loud! Record the umber of heads you obtaied ad calculate

More information

3.2 Properties of Division 3.3 Zeros of Polynomials 3.4 Complex and Rational Zeros of Polynomials

3.2 Properties of Division 3.3 Zeros of Polynomials 3.4 Complex and Rational Zeros of Polynomials Math 60 www.timetodare.com 3. Properties of Divisio 3.3 Zeros of Polyomials 3.4 Complex ad Ratioal Zeros of Polyomials I these sectios we will study polyomials algebraically. Most of our work will be cocered

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Regression, Inference, and Model Building

Regression, Inference, and Model Building Regressio, Iferece, ad Model Buildig Scatter Plots ad Correlatio Correlatio coefficiet, r -1 r 1 If r is positive, the the scatter plot has a positive slope ad variables are said to have a positive relatioship

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

REVISION SHEET FP1 (MEI) ALGEBRA. Identities In mathematics, an identity is a statement which is true for all values of the variables it contains.

REVISION SHEET FP1 (MEI) ALGEBRA. Identities In mathematics, an identity is a statement which is true for all values of the variables it contains. The mai ideas are: Idetities REVISION SHEET FP (MEI) ALGEBRA Before the exam you should kow: If a expressio is a idetity the it is true for all values of the variable it cotais The relatioships betwee

More information

Chapter 23: Inferences About Means

Chapter 23: Inferences About Means Chapter 23: Ifereces About Meas Eough Proportios! We ve spet the last two uits workig with proportios (or qualitative variables, at least) ow it s time to tur our attetios to quatitative variables. For

More information

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled

The picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled 1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how

More information

CS322: Network Analysis. Problem Set 2 - Fall 2009

CS322: Network Analysis. Problem Set 2 - Fall 2009 Due October 9 009 i class CS3: Network Aalysis Problem Set - Fall 009 If you have ay questios regardig the problems set, sed a email to the course assistats: simlac@staford.edu ad peleato@staford.edu.

More information

Example: Find the SD of the set {x j } = {2, 4, 5, 8, 5, 11, 7}.

Example: Find the SD of the set {x j } = {2, 4, 5, 8, 5, 11, 7}. 1 (*) If a lot of the data is far from the mea, the may of the (x j x) 2 terms will be quite large, so the mea of these terms will be large ad the SD of the data will be large. (*) I particular, outliers

More information

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015

ECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015 ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],

More information

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008 Chapter 6 Part 5 Cofidece Itervals t distributio chi square distributio October 23, 2008 The will be o help sessio o Moday, October 27. Goal: To clearly uderstad the lik betwee probability ad cofidece

More information

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01 ENGI 44 Cofidece Itervals (Two Samples) Page -0 Two Sample Cofidece Iterval for a Differece i Populatio Meas [Navidi sectios 5.4-5.7; Devore chapter 9] From the cetral limit theorem, we kow that, for sufficietly

More information

AP Statistics Review Ch. 8

AP Statistics Review Ch. 8 AP Statistics Review Ch. 8 Name 1. Each figure below displays the samplig distributio of a statistic used to estimate a parameter. The true value of the populatio parameter is marked o each samplig distributio.

More information

Some examples of vector spaces

Some examples of vector spaces Roberto s Notes o Liear Algebra Chapter 11: Vector spaces Sectio 2 Some examples of vector spaces What you eed to kow already: The te axioms eeded to idetify a vector space. What you ca lear here: Some

More information

Linear Regression Analysis. Analysis of paired data and using a given value of one variable to predict the value of the other

Linear Regression Analysis. Analysis of paired data and using a given value of one variable to predict the value of the other Liear Regressio Aalysis Aalysis of paired data ad usig a give value of oe variable to predict the value of the other 5 5 15 15 1 1 5 5 1 3 4 5 6 7 8 1 3 4 5 6 7 8 Liear Regressio Aalysis E: The chirp rate

More information

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p

Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE. Part 3: Summary of CI for µ Confidence Interval for a Population Proportion p Chapter 8: STATISTICAL INTERVALS FOR A SINGLE SAMPLE Part 3: Summary of CI for µ Cofidece Iterval for a Populatio Proportio p Sectio 8-4 Summary for creatig a 100(1-α)% CI for µ: Whe σ 2 is kow ad paret

More information

Economics Spring 2015

Economics Spring 2015 1 Ecoomics 400 -- Sprig 015 /17/015 pp. 30-38; Ch. 7.1.4-7. New Stata Assigmet ad ew MyStatlab assigmet, both due Feb 4th Midterm Exam Thursday Feb 6th, Chapters 1-7 of Groeber text ad all relevat lectures

More information

Simple Linear Regression

Simple Linear Regression Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio

More information

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f. Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,

More information

Summary: CORRELATION & LINEAR REGRESSION. GC. Students are advised to refer to lecture notes for the GC operations to obtain scatter diagram.

Summary: CORRELATION & LINEAR REGRESSION. GC. Students are advised to refer to lecture notes for the GC operations to obtain scatter diagram. Key Cocepts: 1) Sketchig of scatter diagram The scatter diagram of bivariate (i.e. cotaiig two variables) data ca be easily obtaied usig GC. Studets are advised to refer to lecture otes for the GC operatios

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber

More information

Chapter two: Hypothesis testing

Chapter two: Hypothesis testing : Hypothesis testig - Some basic cocepts: - Data: The raw material of statistics is data. For our purposes we may defie data as umbers. The two kids of umbers that we use i statistics are umbers that result

More information

Chapter 6 Sampling Distributions

Chapter 6 Sampling Distributions Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to

More information

Lecture 2: Monte Carlo Simulation

Lecture 2: Monte Carlo Simulation STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?

More information

INSTRUCTIONS (A) 1.22 (B) 0.74 (C) 4.93 (D) 1.18 (E) 2.43

INSTRUCTIONS (A) 1.22 (B) 0.74 (C) 4.93 (D) 1.18 (E) 2.43 PAPER NO.: 444, 445 PAGE NO.: Page 1 of 1 INSTRUCTIONS I. You have bee provided with: a) the examiatio paper i two parts (PART A ad PART B), b) a multiple choice aswer sheet (for PART A), c) selected formulae

More information

Introducing Sample Proportions

Introducing Sample Proportions Itroducig Sample Proportios Probability ad statistics Aswers & Notes TI-Nspire Ivestigatio Studet 60 mi 7 8 9 0 Itroductio A 00 survey of attitudes to climate chage, coducted i Australia by the CSIRO,

More information

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y.

Recall the study where we estimated the difference between mean systolic blood pressure levels of users of oral contraceptives and non-users, x - y. Testig Statistical Hypotheses Recall the study where we estimated the differece betwee mea systolic blood pressure levels of users of oral cotraceptives ad o-users, x - y. Such studies are sometimes viewed

More information

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y).

Chapters 5 and 13: REGRESSION AND CORRELATION. Univariate data: x, Bivariate data (x,y). Chapters 5 ad 13: REGREION AND CORRELATION (ectios 5.5 ad 13.5 are omitted) Uivariate data: x, Bivariate data (x,y). Example: x: umber of years studets studied paish y: score o a proficiecy test For each

More information

GG313 GEOLOGICAL DATA ANALYSIS

GG313 GEOLOGICAL DATA ANALYSIS GG313 GEOLOGICAL DATA ANALYSIS 1 Testig Hypothesis GG313 GEOLOGICAL DATA ANALYSIS LECTURE NOTES PAUL WESSEL SECTION TESTING OF HYPOTHESES Much of statistics is cocered with testig hypothesis agaist data

More information

MCT242: Electronic Instrumentation Lecture 2: Instrumentation Definitions

MCT242: Electronic Instrumentation Lecture 2: Instrumentation Definitions Faculty of Egieerig MCT242: Electroic Istrumetatio Lecture 2: Istrumetatio Defiitios Overview Measuremet Error Accuracy Precisio ad Mea Resolutio Mea Variace ad Stadard deviatio Fiesse Sesitivity Rage

More information

Lesson 11: Simple Linear Regression

Lesson 11: Simple Linear Regression Lesso 11: Simple Liear Regressio Ka-fu WONG December 2, 2004 I previous lessos, we have covered maily about the estimatio of populatio mea (or expected value) ad its iferece. Sometimes we are iterested

More information

CURRICULUM INSPIRATIONS: INNOVATIVE CURRICULUM ONLINE EXPERIENCES: TANTON TIDBITS:

CURRICULUM INSPIRATIONS:  INNOVATIVE CURRICULUM ONLINE EXPERIENCES:  TANTON TIDBITS: CURRICULUM INSPIRATIONS: wwwmaaorg/ci MATH FOR AMERICA_DC: wwwmathforamericaorg/dc INNOVATIVE CURRICULUM ONLINE EXPERIENCES: wwwgdaymathcom TANTON TIDBITS: wwwjamestatocom TANTON S TAKE ON MEAN ad VARIATION

More information

(X i X)(Y i Y ) = 1 n

(X i X)(Y i Y ) = 1 n L I N E A R R E G R E S S I O N 10 I Chapter 6 we discussed the cocepts of covariace ad correlatio two ways of measurig the extet to which two radom variables, X ad Y were related to each other. I may

More information

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators.

Open book and notes. 120 minutes. Cover page and six pages of exam. No calculators. IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits

More information

Machine Learning Brett Bernstein

Machine Learning Brett Bernstein Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio

More information

Final Examination Solutions 17/6/2010

Final Examination Solutions 17/6/2010 The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:

More information

Chapter 8: Estimating with Confidence

Chapter 8: Estimating with Confidence Chapter 8: Estimatig with Cofidece Sectio 8.2 The Practice of Statistics, 4 th editio For AP* STARNES, YATES, MOORE Chapter 8 Estimatig with Cofidece 8.1 Cofidece Itervals: The Basics 8.2 8.3 Estimatig

More information

IP Reference guide for integer programming formulations.

IP Reference guide for integer programming formulations. IP Referece guide for iteger programmig formulatios. by James B. Orli for 15.053 ad 15.058 This documet is iteded as a compact (or relatively compact) guide to the formulatio of iteger programs. For more

More information

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations

Section 1.1. Calculus: Areas And Tangents. Difference Equations to Differential Equations Differece Equatios to Differetial Equatios Sectio. Calculus: Areas Ad Tagets The study of calculus begis with questios about chage. What happes to the velocity of a swigig pedulum as its positio chages?

More information

Confidence Intervals for the Population Proportion p

Confidence Intervals for the Population Proportion p Cofidece Itervals for the Populatio Proportio p The cocept of cofidece itervals for the populatio proportio p is the same as the oe for, the samplig distributio of the mea, x. The structure is idetical:

More information

4.3 Growth Rates of Solutions to Recurrences

4.3 Growth Rates of Solutions to Recurrences 4.3. GROWTH RATES OF SOLUTIONS TO RECURRENCES 81 4.3 Growth Rates of Solutios to Recurreces 4.3.1 Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer.

More information