Lecture 4: Simple Linear Regression Models, with Hints at Their Estimation


36-401, Fall 2017, Section B

1  The Simple Linear Regression Model

Let's recall the simple linear regression model from last time. This is a statistical model with two variables X and Y, where we try to predict Y from X. The assumptions of the model are as follows:

1. The distribution of X is arbitrary (and perhaps X is even non-random).
2. If X = x, then Y = β_0 + β_1 x + ε, for some constants ("coefficients", "parameters") β_0 and β_1, and some random noise variable ε.
3. E[ε | X = x] = 0 (no matter what x is), Var[ε | X = x] = σ² (no matter what x is).
4. ε is uncorrelated across observations.

To elaborate, with multiple data points (X_1, Y_1), (X_2, Y_2), ... (X_n, Y_n), the model says that, for each i ∈ 1:n,

    Y_i = β_0 + β_1 X_i + ε_i    (1)

where the noise variables ε_i all have the same expectation (0) and the same variance (σ²), and Cov[ε_i, ε_j] = 0 (unless i = j, of course).

1.1  Plug-In Estimates

In lecture 1, we saw that the optimal linear predictor of Y from X has slope β_1 = Cov[X, Y]/Var[X], and intercept β_0 = E[Y] − β_1 E[X]. A common tactic in devising estimators is to use what's sometimes called the plug-in principle, where we find equations for the parameters which would hold if we knew the full distribution, and plug in the sample versions of the population quantities. We saw this in the last lecture, where we estimated β_1 by the ratio of the sample covariance to the sample variance:

    β̂_1 = c_XY / s²_X    (2)
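In R, the plug-in estimates take one line each. A minimal sketch (not code from the original notes), assuming x and y are numeric vectors holding the data:

plugin.coefs <- function(x, y) {
  # cov() and var() both use the (n-1) denominator, so the factors
  # cancel in the ratio, leaving the plug-in slope c_XY / s²_X
  beta1.hat <- cov(x, y) / var(x)
  beta0.hat <- mean(y) - beta1.hat * mean(x)
  c(intercept = beta0.hat, slope = beta1.hat)
}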

We also saw, in the notes to the last lecture, that so long as the law of large numbers holds,

    β̂_1 → β_1    (3)

as n → ∞. It follows easily that

    β̂_0 = Ȳ − β̂_1 X̄    (4)

will also converge on β_0.

1.2  Least Squares Estimates

An alternative way of estimating the simple linear regression model starts from the objective we are trying to reach, rather than from the formula for the slope. Recall, from lecture 1, that the true optimal slope and intercept are the ones which minimize the mean squared error:

    (β_0, β_1) = argmin_{(b_0, b_1)} E[(Y − (b_0 + b_1 X))²]    (5)

This is a function of the complete distribution, so we can't get it from data, but we can approximate it with data. The in-sample, empirical, or training MSE is

    M̂SE(b_0, b_1) ≡ (1/n) Σ_{i=1}^n (y_i − (b_0 + b_1 x_i))²    (6)

Notice that this is a function of b_0 and b_1; it is also, of course, a function of the data, (x_1, y_1), (x_2, y_2), ... (x_n, y_n), but we will generally suppress that in our notation. If our samples are all independent, then for any fixed (b_0, b_1), the law of large numbers tells us that M̂SE(b_0, b_1) → MSE(b_0, b_1) as n → ∞. So it doesn't seem unreasonable to try minimizing the in-sample error, which we can compute, as a proxy for minimizing the true MSE, which we can't. Where does it lead us?

Start by taking the derivatives with respect to the intercept and the slope:

    ∂M̂SE/∂b_0 = (1/n) Σ_{i=1}^n (y_i − (b_0 + b_1 x_i))(−2)    (7)
    ∂M̂SE/∂b_1 = (1/n) Σ_{i=1}^n (y_i − (b_0 + b_1 x_i))(−2x_i)    (8)

Set these to zero at the optimum (β̂_0, β̂_1):

    (1/n) Σ_{i=1}^n (y_i − (β̂_0 + β̂_1 x_i)) = 0,    (1/n) Σ_{i=1}^n (y_i − (β̂_0 + β̂_1 x_i)) x_i = 0    (9)
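As a sanity check (a sketch, not part of the original notes; the simulated data and parameter values are arbitrary illustrations), we can minimize the training MSE numerically with optim() and compare the answer to the plug-in formula:

train.mse <- function(b, x, y) mean((y - (b[1] + b[2] * x))^2)
set.seed(1)
x <- runif(100)
y <- 2 + 3 * x + rnorm(100, sd = 0.5)      # true beta0 = 2, beta1 = 3
fit <- optim(c(0, 0), train.mse, x = x, y = y)   # Nelder-Mead minimization
fit$par                                    # numerical minimizer of M-hat-SE
c(mean(y) - (cov(x, y) / var(x)) * mean(x),  # plug-in intercept
  cov(x, y) / var(x))                        # plug-in slope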

These are often called the normal equations for least-squares estimation, or the estimating equations: a system of two equations in two unknowns, whose solution gives the estimate. Many people would, at this point, remove the factor of 1/n, but I think keeping it makes it easier to understand the next steps. Re-written slightly, the two equations say

    ȳ − β̂_0 − β̂_1 x̄ = 0    (10)
    (1/n) Σ_i x_i y_i − β̂_0 x̄ − β̂_1 (1/n) Σ_i x_i² = 0    (11)

The first equation, re-written, gives

    β̂_0 = ȳ − β̂_1 x̄    (12)

Substituting this into the remaining equation,

    0 = (1/n) Σ_i x_i y_i − ȳx̄ + β̂_1 x̄² − β̂_1 (1/n) Σ_i x_i²    (13)
    0 = c_XY − β̂_1 s²_X    (14)
    β̂_1 = c_XY / s²_X    (15)

(recalling that c_XY = (1/n) Σ_i x_i y_i − x̄ȳ and s²_X = (1/n) Σ_i x_i² − x̄²). That is, the least-squares estimate of the slope is our old friend the plug-in estimate of the slope, and thus the least-squares intercept is also the plug-in intercept.

Going forward.  The equivalence between the plug-in estimator and the least-squares estimator is a bit of a special case for linear models. In some non-linear models, least squares is quite feasible (though the optimum can only be found numerically, not in closed form); in others, plug-in estimates are more useful than optimization.
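Since Eq. 9 is linear in (β̂_0, β̂_1), the normal equations can also be solved directly as a 2×2 linear system. A sketch (not from the original notes), using the simulated x and y from the previous code:

# Coefficient matrix and right-hand side of the normal equations,
# with both equations divided through by n as in Eq. 9
A <- rbind(c(1,       mean(x)),
           c(mean(x), mean(x^2)))
b <- c(mean(y), mean(x * y))
solve(A, b)   # (beta0.hat, beta1.hat); matches the plug-in answers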

1.3  Bias, Variance and Standard Error of Parameter Estimates

Whether we think of it as deriving from plugging in or from least squares, we can work out some of the properties of this estimator of the coefficients, using the model assumptions. We'll start with the slope, β̂_1:

    β̂_1 = c_XY / s²_X    (16)
        = [ (1/n) Σ_i x_i y_i − x̄ȳ ] / s²_X    (17)
        = [ (1/n) Σ_i x_i (β_0 + β_1 x_i + ε_i) − x̄(β_0 + β_1 x̄ + ε̄) ] / s²_X    (18)
        = [ β_0 x̄ + β_1 (1/n) Σ_i x_i² + (1/n) Σ_i x_i ε_i − x̄β_0 − β_1 x̄² − x̄ε̄ ] / s²_X    (19)
        = [ β_1 s²_X + (1/n) Σ_i x_i ε_i − x̄ε̄ ] / s²_X    (20)
        = β_1 + [ (1/n) Σ_i x_i ε_i − x̄ε̄ ] / s²_X    (21)

Since x̄ε̄ = (1/n) Σ_i x̄ε_i,

    β̂_1 = β_1 + (1/n) Σ_i (x_i − x̄) ε_i / s²_X    (22)

This representation of the slope estimate shows that it is equal to the true slope (β_1) plus something which depends on the noise terms (the ε_i, and their sample average ε̄). We'll use this to find the expected value and the variance of the estimator β̂_1.

In the next couple of paragraphs, I am going to treat the x_i as non-random variables. This is appropriate in designed or controlled experiments, where we get to choose their values. In randomized experiments or in observational studies, obviously the x_i aren't necessarily fixed; however, these expressions will be correct for the conditional expectation E[β̂_1 | x_1, ..., x_n] and conditional variance Var[β̂_1 | x_1, ..., x_n], and I will come back to how we get the unconditional expectation and variance.

Expected value and bias.  Recall that E[ε_i | X_i] = 0, so

    E[ (1/n) Σ_i (x_i − x̄) ε_i / s²_X ] = (1/(n s²_X)) Σ_i (x_i − x̄) E[ε_i] = 0    (23)

Thus,

    E[β̂_1] = β_1    (24)

Since the bias of an estimator is the difference between its expected value and the truth, β̂_1 is an unbiased estimator of the optimal slope.
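A small simulation sketch (with arbitrary illustrative values, not from the notes): hold the x_i fixed, draw fresh noise many times, and check that the slope estimates average out to the true β_1.

set.seed(2)
x <- runif(50)                          # fixed design points
beta0 <- 2; beta1 <- 3; sigma <- 1      # illustrative true parameters
slopes <- replicate(1e4, {
  y <- beta0 + beta1 * x + rnorm(50, sd = sigma)
  cov(x, y) / var(x)                    # plug-in / least-squares slope
})
mean(slopes)                            # should be very close to beta1 = 3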

(To repeat what I'm sure you remember from mathematical statistics: "bias" here is a technical term, meaning no more and no less than E[β̂_1] − β_1. An unbiased estimator could still make systematic mistakes: for instance, it could under-estimate 99% of the time, provided that the 1% of the time it over-estimates, it does so by much more than it under-estimates. Moreover, unbiased estimators are not necessarily superior to biased ones: the total error depends on both the bias of the estimator and its variance, and there are many situations where you can remove lots of bias at the cost of adding a little variance. Least squares for simple linear regression happens not to be one of them, but you shouldn't expect that as a general rule.)

Turning to the intercept,

    E[β̂_0] = E[Ȳ − β̂_1 X̄]    (25)
           = β_0 + β_1 X̄ − E[β̂_1] X̄    (26)
           = β_0 + β_1 X̄ − β_1 X̄    (27)
           = β_0    (28)

so it, too, is unbiased.

Variance and standard error.  Using the formula for the variance of a sum from lecture 1, and the model assumption that all the ε_i are uncorrelated with each other,

    Var[β̂_1] = Var[β_1 + (1/n) Σ_i (x_i − x̄) ε_i / s²_X]    (29)
             = Var[(1/n) Σ_i (x_i − x̄) ε_i / s²_X]    (30)
             = (1/n²) Σ_i (x_i − x̄)² Var[ε_i] / (s²_X)²    (31)
             = (σ² s²_X / n) / (s²_X)²    (32)
             = σ² / (n s²_X)    (33)

In words, this says that the variance of the slope estimate goes up as the noise around the regression line (σ²) gets bigger, and goes down as we have more observations (n), which are further spread out along the horizontal axis (s²_X); it should not be surprising that it's easier to work out the slope of a line from many, well-separated points on the line than from a few points smushed together. The standard error of an estimator is just its standard deviation, or the square root of its variance:

    se(β̂_1) = σ / √(n s²_X)    (34)
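Continuing the simulation sketch above, we can compare the empirical variance of the simulated slope estimates against σ²/(n s²_X) from Eq. 33, remembering that s²_X there uses the n (not n−1) denominator:

n <- length(x)
s2.x <- mean((x - mean(x))^2)   # s_X^2 with denominator n
var(slopes)                     # empirical variance of the slope estimates
sigma^2 / (n * s2.x)            # theoretical value from Eq. 33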

I will leave working out the variance of β̂_0 as an exercise.

Unconditional-on-X properties.  The last few paragraphs, as I said, have looked at the expectation and variance of β̂_1 conditional on x_1, ..., x_n, either because the x's really are non-random (e.g., controlled by us), or because we're just interested in conditional inference. If we do care about unconditional properties, then we still need to find E[β̂_1] and Var[β̂_1], not just E[β̂_1 | x_1, ..., x_n] and Var[β̂_1 | x_1, ..., x_n]. Fortunately, this is easy, so long as the simple linear regression model holds. To get the unconditional expectation, we use the law of total expectation:

    E[β̂_1] = E[E[β̂_1 | X_1, ..., X_n]]    (35)
           = E[β_1] = β_1    (36)

That is, the estimator is unconditionally unbiased. To get the unconditional variance, we use the law of total variance:

    Var[β̂_1] = E[Var[β̂_1 | X_1, ..., X_n]] + Var[E[β̂_1 | X_1, ..., X_n]]    (37)
             = E[σ² / (n s²_X)] + Var[β_1]    (38)
             = (σ²/n) E[1/s²_X]    (39)

since β_1 is a constant, so Var[β_1] = 0.

1.4  Parameter Interpretation; Causality

Two of the parameters are easy to interpret.

σ² is the variance of the noise around the regression line; σ is a typical distance of a point from the line. ("Typical" here in a special sense: it's the root-mean-squared distance, rather than, say, the average absolute distance.)

β_0 is simply the expected value of Y when X is 0, E[Y | X = 0]. The point X = 0 usually has no special significance, but this setting does ensure that the line goes through the point (E[X], E[Y]).

The interpretation of the slope is both very straightforward and very tricky. Mathematically, it's easy to convince yourself that, for any x,

    β_1 = E[Y | X = x+1] − E[Y | X = x]    (40)

or, for any x_1, x_2,

    β_1 = (E[Y | X = x_2] − E[Y | X = x_1]) / (x_2 − x_1)    (41)

This is just saying that the slope of a line is rise over run.

The tricky part is that we have a very strong, natural tendency to interpret this as telling us something about causation: "If we change X by 1, then on average Y will change by β_1." This interpretation is usually completely unsupported by the analysis. If I use an old-fashioned mercury thermometer, the height of mercury in the tube usually has a nice linear relationship with the temperature of the room the thermometer is in. This linear relationship goes both ways, so we could regress temperature (Y) on mercury height (X). But if I manipulate the height of the mercury (say, by changing the ambient pressure, or by shining a laser into the tube, etc.), changing the height X will not, in fact, change the temperature outside.

The right way to interpret β_1 is not as the result of a change, but as an expected difference. The correct catch-phrase would be something like "If we select two sets of cases from the un-manipulated distribution where X differs by 1, we expect Y to differ by β_1." This covers the thermometer example, and every other one I can think of. It is, I admit, much more inelegant than "If X changes by 1, Y changes by β_1 on average", but it has the advantage of being true, which the other does not. There are circumstances where regression can be a useful part of causal inference, but we will need a lot more tools to grasp them; that will come towards the end of the course.

2  The Gaussian-Noise Simple Linear Regression Model

We have, so far, assumed comparatively little about the noise term ε. The advantage of this is that our conclusions apply to lots of different situations; the drawback is that there's really not all that much more to say about our estimator β̂_1, or about our predictions, than we've already gone over. If we made more detailed assumptions about ε, we could make more precise inferences.

There are lots of forms of distributions for ε which we might contemplate, and which are compatible with the assumptions of the simple linear regression model (Figure 1). The one which has become the most common over the last two centuries is to assume ε follows a Gaussian distribution. The result is the Gaussian-noise simple linear regression model[1]:

1. The distribution of X is arbitrary (and perhaps X is even non-random).
2. If X = x, then Y = β_0 + β_1 x + ε, for some constants ("coefficients", "parameters") β_0 and β_1, and some random noise variable ε.
3. ε ~ N(0, σ²), independent of X.
4. ε is independent across observations.

[1] Our textbook, rather old-fashionedly, calls this the "normal error" model rather than "Gaussian noise". I dislike this: "normal" is an over-loaded word in math, while "Gaussian" is (comparatively) specific; "error" made sense in Gauss's original context of modeling, specifically, errors of observation, but is misleading generally; and calling Gaussian distributions "normal" suggests they are much more common than they really are.
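To make the assumptions concrete, here is a sketch of one draw from the Gaussian-noise model (the parameter values are arbitrary illustrations, not from the notes):

n <- 100
x <- rnorm(n, mean = 2, sd = 1)        # X's distribution is arbitrary
epsilon <- rnorm(n, sd = 0.5)          # Gaussian noise, independent of X
y <- 2 + 3 * x + epsilon               # beta0 = 2, beta1 = 3
plot(x, y)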

[Figure 1: six panels, each showing a noise density plotted against ε.]

par(mfrow=c(2,3))
curve(dnorm(x), from=-3, to=3, xlab=expression(epsilon), ylab="", ylim=c(0,1))
curve(exp(-abs(x))/2, from=-3, to=3, xlab=expression(epsilon), ylab="", ylim=c(0,1))
curve(sqrt(pmax(0,1-x^2))/(pi/2), from=-3, to=3, xlab=expression(epsilon), ylab="", ylim=c(0,1))
curve(dt(x,3), from=-3, to=3, xlab=expression(epsilon), ylab="", ylim=c(0,1))
curve(dgamma(x+1.5, shape=1.5, scale=1), from=-3, to=3, xlab=expression(epsilon), ylab="", ylim=c(0,1))
curve(0.5*dgamma(x+1.5, shape=1.5, scale=1) + 0.5*dgamma(0.5-x, shape=0.5, scale=1), from=-3, to=3, xlab=expression(epsilon), ylab="", ylim=c(0,1))
par(mfrow=c(1,1))

Figure 1: Some possible noise distributions for the simple linear regression model, since all have E[ε] = 0, and could get any variance by scaling. (The model is even compatible with each observation taking ε from a different distribution.) From top left to bottom right: Gaussian; double-exponential ("Laplacian"); circular distribution; t with 3 degrees of freedom; a gamma distribution (shape 1.5, scale 1) shifted to have mean 0; mixture of two gammas with shapes 1.5 and 0.5, each off-set to have expectation 0. The first three were all used as error models in the 18th and 19th centuries. (See the source file for how to get the code below the figure.)

You will notice that these assumptions are strictly stronger than those of the simple linear regression model. More exactly, the first two assumptions are the same, while the third and fourth assumptions of the Gaussian-noise model imply the corresponding assumptions of the other model. This means that everything we have done so far directly applies to the Gaussian-noise model. On the other hand, the stronger assumptions let us say more. They tell us, exactly, the probability distribution for Y given X, and so will let us get exact distributions for predictions and for other inferential statistics.

Why the Gaussian noise model?  Why should we think that the noise around the regression line would follow a Gaussian distribution, independent of X? There are two big reasons.

1. Central limit theorem.  The noise might be due to adding up the effects of lots of little random causes, all nearly independent of each other and of X, where each of the effects is of roughly similar magnitude. Then the central limit theorem will take over, and the distribution of the sum of effects will indeed be pretty Gaussian. For Gauss's original context, X was (simplifying) "Where is such-and-such a planet in space?", Y was "Where does an astronomer record the planet as appearing in the sky?", and the noise came from defects in the telescope, eye-twitches, atmospheric distortions, etc., etc., so this was pretty reasonable. It is clearly not a universal truth of nature, however, nor even something we should expect to hold true as a general rule, as the name "normal" suggests.

2. Mathematical convenience.  Assuming Gaussian noise lets us work out a very complete theory of inference and prediction for the model, with lots of closed-form answers to questions like "What is the optimal estimate of the variance?" or "What is the probability that we'd see a fit this good from a line with a non-zero intercept if the true line goes through the origin?", etc., etc. Answering such questions without the Gaussian-noise assumption needs somewhat more advanced techniques, and much more advanced computing; we'll get to them towards the end of the class.

2.1  Visualizing the Gaussian Noise Model

The Gaussian noise model gives us not just an expected value for Y at each x, but a whole conditional distribution for Y at each x. To visualize it, then, it's not enough to just sketch a curve; we need a three-dimensional surface, showing, for each combination of x and y, the probability density of Y around that y given that x. Figure 2 illustrates.

[Figure 2: a perspective plot of the conditional density p(y|x) as a surface over the (x, y) plane.]

x.seq <- seq(from=-1, to=3, length.out=50)
y.seq <- seq(from=-15, to=15, length.out=50)
cond.pdf <- function(x,y) { dnorm(y, mean=10-5*x, sd=0.5) }
z <- outer(x.seq, y.seq, cond.pdf)
persp(x.seq, y.seq, z, ticktype="detailed", phi=75, xlab="x", ylab="y",
      zlab=expression(p(y|x)), cex.axis=0.8)

Figure 2: Illustrating how the conditional pdf of Y varies as a function of X, for a hypothetical Gaussian noise simple linear regression where β_0 = 10, β_1 = −5, and σ² = (0.5)². The perspective is adjusted so that we are looking nearly straight down from above on the surface. (Can you find a better viewing angle?) See help(persp) for the 3D plotting (especially the examples), and help(outer) for the outer function, which takes all combinations of elements from two vectors and pushes them through a function. How would you modify this so that the regression line went through the origin with a slope of 4/3 and a standard deviation of 5?
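For the caption's closing question, here is a sketch of one possible answer (mine, not from the original notes): put the line through the origin with slope 4/3, raise the noise standard deviation to 5, and widen the y grid so the surface is not clipped.

y.seq2 <- seq(from=-20, to=20, length.out=50)
cond.pdf2 <- function(x,y) { dnorm(y, mean=(4/3)*x, sd=5) }
z2 <- outer(x.seq, y.seq2, cond.pdf2)
persp(x.seq, y.seq2, z2, ticktype="detailed", phi=75, xlab="x", ylab="y",
      zlab=expression(p(y|x)), cex.axis=0.8)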

2.2  Maximum Likelihood vs. Least Squares

As you remember from your mathematical statistics class, the likelihood of a parameter value on a data set is the probability density at the data under those parameters. We could not work with the likelihood under the simple linear regression model, because it didn't specify enough about the distribution to let us calculate a density. With the Gaussian-noise model, however, we can write down a likelihood.[2] By the model's assumptions, if we think the parameters are b_0, b_1, s² (reserving the Greek letters for their true values), then Y | X = x ~ N(b_0 + b_1 x, s²), and Y_i and Y_j are independent given X_i and X_j, so the over-all likelihood is

    ∏_{i=1}^n (1/√(2πs²)) exp( −(y_i − (b_0 + b_1 x_i))² / (2s²) )    (42)

As usual, we work with the log-likelihood, which gives us the same information[3] but replaces products with sums:

    L(b_0, b_1, s²) = −(n/2) log 2π − n log s − (1/(2s²)) Σ_{i=1}^n (y_i − (b_0 + b_1 x_i))²    (43)

We recall from mathematical statistics that when we've got a likelihood function, we generally want to maximize it. That is, we want to find the parameter values which make the data we observed as likely, as probable, as the model will allow. (This may not be very likely; that's another issue.) We recall from calculus that one way to maximize is to take derivatives and set them to zero.

    ∂L/∂b_0 = −(1/(2s²)) Σ_{i=1}^n 2(y_i − (b_0 + b_1 x_i))(−1)    (44)
    ∂L/∂b_1 = −(1/(2s²)) Σ_{i=1}^n 2(y_i − (b_0 + b_1 x_i))(−x_i)    (45)

Notice that when we set these derivatives to zero, all the multiplicative constants (in particular, the prefactor of 1/(2s²)) go away. We are left with

    Σ_{i=1}^n (y_i − (β̂_0 + β̂_1 x_i)) = 0    (46)
    Σ_{i=1}^n (y_i − (β̂_0 + β̂_1 x_i)) x_i = 0    (47)

These are, up to a factor of 1/n, exactly the equations we got from the method of least squares (Eq. 9). That means that the least-squares solution is the maximum likelihood estimate under the Gaussian noise model; this is no coincidence.[4]

[2] Strictly speaking, this is a conditional (on X) likelihood; but only pedants use the adjective in this context.
[3] Why is this?
[4] It's no coincidence because, to put it somewhat anachronistically, what Gauss did was ask himself "for what distribution of the noise would least squares maximize the likelihood?".
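We can confirm the equivalence numerically (a sketch, not from the notes; x and y are the simulated vectors from the Gaussian-noise example above, and the function names are illustrative): maximize the Gaussian log-likelihood with a general-purpose optimizer, and compare with lm()'s least-squares coefficients.

neg.loglik <- function(theta, x, y) {
  b0 <- theta[1]; b1 <- theta[2]
  s <- exp(theta[3])                     # optimize over log(s) to keep s > 0
  -sum(dnorm(y, mean = b0 + b1 * x, sd = s, log = TRUE))
}
mle <- optim(c(0, 0, 0), neg.loglik, x = x, y = y)
mle$par[1:2]                             # MLE of (beta0, beta1)
coef(lm(y ~ x))                          # least squares: agrees closely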

Now let's take the derivative with respect to s:

    ∂L/∂s = −n/s + (1/s³) Σ_{i=1}^n (y_i − (b_0 + b_1 x_i))²    (48)

Setting this to 0 at the optimum, and multiplying through by σ̂³/n, we get

    σ̂² = (1/n) Σ_{i=1}^n (y_i − (β̂_0 + β̂_1 x_i))²    (49)

Notice that the right-hand side is just the in-sample mean squared error.

Other models.  Maximum likelihood estimates of the regression curve coincide with least-squares estimates when the noise around the curve is additive, Gaussian, of constant variance, and both independent of X and of other noise terms. These were all assumptions we used in setting up a log-likelihood which was, up to constants, proportional to the (negative) mean squared error. If any of those assumptions fail, maximum likelihood and least squares estimates can diverge, though sometimes the MLE solves a generalized least squares problem (as we'll see later in this course).

Exercises

To think through, not to hand in.

1. Show that if E[ε | X = x] = 0 for all x, then Cov[X, ε] = 0. Would this still be true if E[ε | X = x] = a for some other constant a?

2. Find the variance of β̂_0. Hint: do you need to worry about covariance between Ȳ and β̂_1?
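Finally, a numerical check of Eq. 49 (a sketch continuing the earlier simulated example, with the same x, y, and mle objects; not one of the exercises): the MLE of σ² is the in-sample MSE of the fitted line, not the (n − 2)-denominator estimate that R's summary(lm(...)) reports.

fit <- lm(y ~ x)
mean(residuals(fit)^2)     # in-sample MSE = MLE of sigma^2 (Eq. 49)
exp(mle$par[3])^2          # matches the numerical MLE from above
summary(fit)$sigma^2       # uses the n-2 denominator, so slightly larger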
