Stat 328 Formulas


Quantiles

Roughly speaking, the $p$ quantile of a distribution is a number $Q(p)$ such that a fraction $p$ of the distribution is to the left and a fraction $(1-p)$ is to the right of the number. There are various possible conventions for making this notion precise for an empirical distribution/data set. One is that for ordered data values $y_1 \le y_2 \le \cdots \le y_n$, the $i$th ordered value is the $\frac{i-.5}{n}$ quantile, i.e.
$$y_i = Q\!\left(\frac{i-.5}{n}\right)$$
(and other quantiles are gotten by interpolation). A particularly important quantile is $Q(.5)$, the distribution median. This is a number that puts half of the distribution to its left and half to its right.

Mean, Variance, and Standard Deviation

For a data set $y_1, y_2, \ldots, y_n$ the sample mean is
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
and if the data set constitutes an entire population of interest, the population mean is
$$\mu = \frac{1}{N}\sum_{i=1}^{N} y_i .$$
Further, the sample variance is
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$$
and the corresponding population variance is
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (y_i - \mu)^2 .$$
The square root of the variance is in the original units and is called the standard deviation.
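These computations are easy to carry out in software. As a concrete illustration (not part of the original formula sheet), here is a short Python sketch using numpy; the data values are made up, and note that np.quantile uses one of several possible interpolation conventions, not necessarily the $(i-.5)/n$ convention above.

```python
import numpy as np

y = np.array([3.1, 4.8, 2.2, 5.0, 4.1, 3.7])  # hypothetical data
n = len(y)

# (i - .5)/n quantile convention: the i-th ordered value is Q((i - .5)/n)
y_sorted = np.sort(y)
p = (np.arange(1, n + 1) - 0.5) / n

ybar = y.mean()                 # sample mean
median = np.quantile(y, 0.5)    # Q(.5), the median

s2 = y.var(ddof=1)              # sample variance (divisor n - 1)
sigma2 = y.var(ddof=0)          # population variance (divisor N)
print(ybar, median, s2, np.sqrt(s2), sigma2)
```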

Inference Based on the Extremes of a Single Sample

If one adopts a probability model that says that observations are independent random draws from a fixed continuous distribution/population/universe, it is easy to see how to make some simple inferences based on the sample extremes. For data $y_1 \le y_2 \le \cdots \le y_n$, the interval $(y_1, y_n)$ can serve as an interval meant to bracket the distribution median or as an interval meant to bracket a single additional value drawn from the distribution. As a confidence interval for $Q(.5)$, the interval $(y_1, y_n)$ has associated confidence/reliability
$$1 - (.5)^{n-1}$$
As a prediction interval for $y_{n+1}$ (a single additional observation from this distribution) the appropriate associated confidence/reliability is
$$\frac{n-1}{n+1}$$

Calculation of Normal Probabilities

Areas under the normal curve with $\mu = 0$ and $\sigma = 1$ ("standard normal probabilities" between $0$ and $z$) are given in Table 1, Appendix C of the text. The symmetry of the standard normal curve about $0$ then allows one to find arbitrary standard normal probabilities using the table. Areas under the normal curve with mean $\mu$ and standard deviation $\sigma$ are available by converting values on the original scale to values on the "$z$ scale" via
$$z = \frac{y - \mu}{\sigma}$$
and then using the standard normal table. Many handheld calculators will compute normal probabilities.

Normal Plotting

It is extremely useful to be able to model a data-generating mechanism as "independent random draws from a normal distribution." A means of investigating the extent to which this is sensible is to make a so-called "normal plot" of a data set. This is a plot that facilitates comparison of data quantiles and normal quantiles. If these are roughly linearly related, the normal model is plausible. If they are not, the normal model is implausible. For data $y_1 \le y_2 \le \cdots \le y_n$ and $Q_z(p)$ the standard normal quantile function ($Q_z(p)$ is the number that puts standard normal probability $p$ to its left), a normal plot can be made by plotting the points
$$\left(y_i,\; Q_z\!\left(\frac{i-.5}{n}\right)\right)$$
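For illustration only, here is a small Python sketch (numpy and scipy assumed available) that computes the two reliabilities above and the coordinates of a normal plot for hypothetical data; the correlation between the two coordinate sets is one rough numerical proxy for the "roughly linear" visual check.

```python
import numpy as np
from scipy import stats

y = np.sort(np.array([3.1, 4.8, 2.2, 5.0, 4.1, 3.7]))  # hypothetical data
n = len(y)

# confidence that (y_1, y_n) brackets the median: 1 - (.5)^(n-1)
conf_median = 1 - 0.5 ** (n - 1)
# confidence that (y_1, y_n) brackets one additional draw: (n-1)/(n+1)
conf_pred = (n - 1) / (n + 1)

# normal plot coordinates: data quantiles vs. Qz((i - .5)/n)
p = (np.arange(1, n + 1) - 0.5) / n
z = stats.norm.ppf(p)
r = np.corrcoef(y, z)[0, 1]   # near 1 suggests the normal model is plausible
print(conf_median, conf_pred, r)
```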

One- and Two-Sample Intervals for Normal Distributions

If it is plausible to think of $y_1, y_2, \ldots, y_n$ as random draws from a single normal distribution, then probability theory can be invoked to support quantitative inference. Several useful estimation and prediction results (based on theory for derived distributions of appropriate statistics) are as follows.

Confidence limits for the distribution mean $\mu$ are
$$\bar{y} \pm t\,\frac{s}{\sqrt{n}}$$
where $t$ is a quantile of the so-called "$t$ distribution with $\nu = n-1$ degrees of freedom." (The $t$ distributions are tabled in Table 2, Appendix C of the text.)

Prediction limits for $y_{n+1}$, a single additional observation from the distribution, are
$$\bar{y} \pm t\,s\,\sqrt{1 + \frac{1}{n}}$$

Confidence limits for the distribution standard deviation $\sigma$ are
$$s\,\sqrt{\frac{n-1}{\chi^2_{\text{upper}}}} \quad\text{and}\quad s\,\sqrt{\frac{n-1}{\chi^2_{\text{lower}}}}$$
where $\chi^2_{\text{upper}}$ and $\chi^2_{\text{lower}}$ are upper and lower quantiles of the so-called "$\chi^2$ distribution with $\nu = n-1$ degrees of freedom." (The $\chi^2$ distributions are tabled in Table 10, Appendix C of the text.)

If it is plausible to think of two samples as independently drawn from two normal distributions with a common variance (this might be investigated by normal plotting both samples on a single set of axes, looking for two linear plots with comparable slopes), probability theory can again be invoked to support quantitative inferences for the difference in means. That is, for
$$s_P = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{(n_1-1) + (n_2-1)}}$$
a "pooled estimate" of $\sigma$ (based on a weighted average of the two sample variances), confidence limits for $\mu_1 - \mu_2$ are
$$\bar{y}_1 - \bar{y}_2 \pm t\,s_P\,\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where $t$ is a quantile of the $t$ distribution with $\nu = n_1 + n_2 - 2$ degrees of freedom.
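A hedged Python sketch of the one-sample limits above follows (hypothetical data; scipy's quantile functions supply the $t$ and $\chi^2$ quantiles).

```python
import numpy as np
from scipy import stats

y = np.array([9.8, 10.2, 10.1, 9.7, 10.4, 10.0])  # hypothetical sample
n, ybar, s = len(y), y.mean(), y.std(ddof=1)
t = stats.t.ppf(0.975, df=n - 1)                   # for 95% two-sided limits

ci_mean = (ybar - t * s / np.sqrt(n), ybar + t * s / np.sqrt(n))
pred = (ybar - t * s * np.sqrt(1 + 1 / n), ybar + t * s * np.sqrt(1 + 1 / n))

chi2_lo = stats.chi2.ppf(0.025, df=n - 1)          # lower chi-square quantile
chi2_hi = stats.chi2.ppf(0.975, df=n - 1)          # upper chi-square quantile
ci_sigma = (s * np.sqrt((n - 1) / chi2_hi), s * np.sqrt((n - 1) / chi2_lo))
print(ci_mean, pred, ci_sigma)
```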

If one drops the equal variance assumption (but maintains that the two samples are independently drawn from two normal distributions), probability theory can again be invoked to support quantitative inferences for the ratio of standard deviations. That is, confidence limits for $\sigma_1/\sigma_2$ are
$$\frac{s_1}{s_2}\cdot\frac{1}{\sqrt{F_{n_1-1,\,n_2-1}}} \quad\text{and}\quad \frac{s_1}{s_2}\cdot\sqrt{F_{n_2-1,\,n_1-1}}$$
where $F_{\nu_1,\nu_2}$ is an upper quantile of the so-called "$F$ distribution with $\nu_1$ numerator degrees of freedom and $\nu_2$ denominator degrees of freedom." (The $F$ distributions are tabled in Appendix C of the text.)

One- and Two-Sample Testing for Normal Distributions

If it is plausible to think of $y_1, y_2, \ldots, y_n$ as random draws from a single normal distribution, then probability theory can be invoked to support hypothesis testing. That is, $\mathrm{H}_0\!:\mu = \#$ (for a hypothesized value $\#$) can be tested using the test statistic
$$T = \frac{\bar{y} - \#}{s/\sqrt{n}}$$
and a $t_{n-1}$ reference distribution. Further, $\mathrm{H}_0\!:\sigma = \#$ can be tested using the test statistic
$$X^2 = \frac{(n-1)s^2}{\#^2}$$
and a $\chi^2_{n-1}$ reference distribution.

If it is plausible to think of two samples as independently drawn from two normal distributions with a common variance, probability theory can again be invoked to support testing for the difference in means. That is, $\mathrm{H}_0\!:\mu_1 - \mu_2 = \#$ can be tested using the test statistic
$$T = \frac{\bar{y}_1 - \bar{y}_2 - \#}{s_P\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
and a $t_{n_1+n_2-2}$ reference distribution.

And if one drops the equal variance assumption (but maintains that the two samples are independently drawn from two normal distributions), probability theory can again be invoked to support testing for the ratio of standard deviations. That is, $\mathrm{H}_0\!:\sigma_1/\sigma_2 = 1$ can be tested using the statistic
$$F = \frac{s_1^2}{s_2^2}$$
and an $F_{n_1-1,\,n_2-1}$ reference distribution.
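The tests above are standard enough that scipy covers most of them; a sketch with made-up data follows. Note that the two-sided $F$ $p$-value convention used here (doubling the smaller tail) is one common choice, not the only one.

```python
import numpy as np
from scipy import stats

y1 = np.array([10.1, 9.9, 10.4, 10.2, 9.8])   # hypothetical samples
y2 = np.array([9.5, 9.8, 9.4, 9.9, 9.6, 9.7])

# pooled two-sample t test of H0: mu1 - mu2 = 0 (equal variances assumed)
t_stat, p_val = stats.ttest_ind(y1, y2, equal_var=True)

# F test of H0: sigma1/sigma2 = 1 using F = s1^2 / s2^2
f = y1.var(ddof=1) / y2.var(ddof=1)
n1, n2 = len(y1), len(y2)
p_f = 2 * min(stats.f.cdf(f, n1 - 1, n2 - 1), stats.f.sf(f, n1 - 1, n2 - 1))
print(t_stat, p_val, f, p_f)
```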

Simple Linear Regression Model

The basic (normal) "simple linear regression" model says that a response/output variable $y$ depends on an explanatory/input/system variable $x$ in a "noisy but linear" way. That is, one supposes that there is a linear relationship between $x$ and mean $y$,
$$\mu_{y|x} = \beta_0 + \beta_1 x$$
and that (for fixed $x$) there is around that mean a distribution of $y$ that is normal. Further, the standard assumption is that the standard deviation of the response distribution is constant in $x$. In symbols it is standard to write
$$y = \beta_0 + \beta_1 x + \epsilon$$
where $\epsilon$ is normal with mean $0$ and standard deviation $\sigma$. This describes one $y$. Where several observations $y_i$ with corresponding values $x_i$ are under consideration, the assumption is that the $y_i$ (the $\epsilon_i$) are independent. (The $\epsilon_i$ are conceptually equivalent to unrelated random draws from the same fixed normal continuous distribution.) The model statement in its full glory is then
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i \quad\text{for } i = 1, 2, \ldots, n$$
where the $\epsilon_i$ for $i = 1, 2, \ldots, n$ are independent normal $(0, \sigma^2)$ random variables.

The model statement above is a perfectly theoretical matter. One can begin with it, and for specific choices of $\beta_0$, $\beta_1$ and $\sigma$ find probabilities for $y$ at given values of $x$. In applications, the real mode of operation is instead to take data pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ and use them to make inferences about the parameters $\beta_0$, $\beta_1$ and $\sigma$ and to make predictions based on the estimates (based on the empirically fitted model).

Descriptive Analysis of Approximately Linear $(x, y)$ Data

After plotting $(x, y)$ data to determine that the "linear in $x$ mean of $y$" model makes some sense, it is reasonable to try to quantify "how linear" the data look and to find a line of "best fit" to the scatterplot of the data. The sample correlation between $y$ and $x$
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$
is a measure of strength of linear relationship between $x$ and $y$. Calculus can be invoked to find a slope $b_1$ and intercept $b_0$ minimizing the sum of squared vertical distances from data points to a fitted line, $\sum \left(y_i - (b_0 + b_1 x_i)\right)^2$. These "least squares" values are

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \quad\text{and}\quad b_0 = \bar{y} - b_1\bar{x}$$
It is further common to refer to the value of $y$ on the "least squares line" corresponding to $x_i$ as a fitted or predicted value
$$\hat{y}_i = b_0 + b_1 x_i$$
One might take the difference between what is observed ($y_i$) and what is "predicted" or "explained" ($\hat{y}_i$) as a kind of leftover part or "residual" corresponding to a data value
$$e_i = y_i - \hat{y}_i$$
The sum $\sum (y_i - \bar{y})^2$ is most of the sample variance of the values $y_i$ (it lacks only the divisor $n-1$). It is a measure of raw variation in the response variable. People often call it the "total sum of squares" and write
$$SSTot = \sum (y_i - \bar{y})^2$$
The sum of squared residuals is a measure of variation in response remaining unaccounted for after fitting a line to the data. People often call it the "error sum of squares" and write
$$SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$$
One is guaranteed that $SSTot \ge SSE$. So the difference $SSTot - SSE$ is a non-negative measure of variation accounted for in fitting a line to $(x, y)$ data. People often call it the "regression sum of squares" and write
$$SSR = SSTot - SSE$$
The coefficient of determination expresses $SSR$ as a fraction of $SSTot$ and is
$$R^2 = \frac{SSR}{SSTot}$$
which is interpreted as "the fraction of raw variation in $y$ accounted for in the model fitting process."
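A minimal numpy sketch of these least squares computations, using hypothetical $(x, y)$ data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical (x, y) data
y = np.array([1.8, 3.9, 6.2, 7.8, 10.1])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

yhat = b0 + b1 * x          # fitted values
e = y - yhat                # residuals

SSTot = np.sum((y - y.mean()) ** 2)
SSE = np.sum(e ** 2)
SSR = SSTot - SSE
R2 = SSR / SSTot
print(b0, b1, SSE, R2)
```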

Parameter Estimates for SLR

The descriptive statistics for $(x, y)$ data can be used to provide "single number estimates" of the (typically unknown) parameters of the simple linear regression model. That is, the slope of the least squares line can serve as an estimate of $\beta_1$,
$$\hat{\beta}_1 = b_1$$
and the intercept of the least squares line can serve as an estimate of $\beta_0$,
$$\hat{\beta}_0 = b_0$$
The variance of $y$ for a given $x$ can be estimated by a kind of average of squared residuals
$$s^2 = \frac{1}{n-2}\sum e_i^2 = \frac{SSE}{n-2}$$
Of course, the square root of this "regression sample variance" is $s = \sqrt{s^2}$ and serves as a single number estimate of $\sigma$.

Interval-Based Inference Methods for SLR

The normal simple linear regression model provides inference formulas for model parameters. Confidence limits for $\sigma$ are
$$s\,\sqrt{\frac{n-2}{\chi^2_{\text{upper}}}} \quad\text{and}\quad s\,\sqrt{\frac{n-2}{\chi^2_{\text{lower}}}}$$
where $\chi^2_{\text{upper}}$ and $\chi^2_{\text{lower}}$ are upper and lower quantiles of the $\chi^2$ distribution with $\nu = n-2$ degrees of freedom. And confidence limits for $\beta_1$ (the slope of the line relating mean $y$ to $x$ ... the rate of change of average $y$ with respect to $x$) are
$$b_1 \pm t\,\frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$
where $t$ is a quantile of the $t$ distribution with $\nu = n-2$ degrees of freedom. Confidence limits for $\mu_{y|x} = \beta_0 + \beta_1 x$ (the mean value of $y$ at a given value $x$) are
$$(b_0 + b_1 x) \pm t\,s\,\sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
(Note that by choosing $x = 0$, this formula provides confidence limits for $\beta_0$, though this parameter is rarely of independent practical interest.)

Prediction limits for an additional observation $y$ at a given value $x$ are
$$(b_0 + b_1 x) \pm t\,s\,\sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$

Hypothesis Tests and SLR

The normal simple linear regression model supports hypothesis testing. $\mathrm{H}_0\!:\beta_1 = \#$ can be tested using the test statistic
$$T = \frac{b_1 - \#}{s/\sqrt{\sum (x_i - \bar{x})^2}}$$
and a $t_{n-2}$ reference distribution. $\mathrm{H}_0\!:\mu_{y|x} = \#$ can be tested using the test statistic
$$T = \frac{(b_0 + b_1 x) - \#}{s\,\sqrt{\dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}}$$
and a $t_{n-2}$ reference distribution.

ANOVA and SLR

The breaking down of $SSTot$ into $SSR$ and $SSE$ can be thought of as a kind of "analysis of variance" in $y$. That enterprise is often summarized in a special kind of table. The general form is as below.

ANOVA Table (for SLR)

Source       SS       df       MS                  F
Regression   SSR      1        MSR = SSR/1         F = MSR/MSE
Error        SSE      n - 2    MSE = SSE/(n - 2)
Total        SSTot    n - 1

In this table the ratios of sums of squares to degrees of freedom are called "mean squares." The mean square for error is, in fact, the estimate of $\sigma^2$ (i.e. $MSE = s^2$). As it turns out, the ratio in the "F" column can be used as a test statistic for the hypothesis $\mathrm{H}_0\!:\beta_1 = 0$. The appropriate reference distribution is the $F_{1,\,n-2}$ distribution. As it turns out, the value $F = MSR/MSE$ is the square of the $t$ statistic for testing this hypothesis, and the $F$ test produces exactly the same $p$-values as a two-sided $t$ test.
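Continuing the same hypothetical data, a sketch of the interval and ANOVA computations above (the $F = t^2$ identity can be checked numerically):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data again
y = np.array([1.8, 3.9, 6.2, 7.8, 10.1])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e ** 2) / (n - 2))      # sqrt(MSE)

# 95% confidence limits for beta1
t = stats.t.ppf(0.975, df=n - 2)
se_b1 = s / np.sqrt(np.sum((x - x.mean()) ** 2))
ci_b1 = (b1 - t * se_b1, b1 + t * se_b1)

# ANOVA F test of H0: beta1 = 0 (F equals t^2 for the two-sided t test)
SSTot = np.sum((y - y.mean()) ** 2)
SSR = SSTot - np.sum(e ** 2)
F = (SSR / 1) / s ** 2
p_val = stats.f.sf(F, 1, n - 2)
print(ci_b1, F, p_val)
```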

Standardized Residuals and SLR

The theoretical variances of the residuals turn out to depend upon their corresponding $x$ values. As a means of putting these residuals all on the same footing, it is common to "standardize" them by dividing by an estimated standard deviation for each. This produces standardized residuals
$$e_i^* = \frac{e_i}{s\,\sqrt{1 - \dfrac{1}{n} - \dfrac{(x_i - \bar{x})^2}{\sum (x_j - \bar{x})^2}}}$$
These (if the normal simple linear regression model is a good one) "ought" to look as if they are approximately normal with mean $0$ and standard deviation $1$. Various kinds of plotting with these standardized residuals (or with the raw residuals) are used as means of "model checking" or "model diagnostics."

Multiple Linear Regression Model

The basic (normal) "multiple linear regression" model says that a response/output variable $y$ depends on explanatory/input/system variables $x_1, x_2, \ldots, x_k$ in a "noisy but linear" way. That is, one supposes that there is a linear relationship between $x_1, x_2, \ldots, x_k$ and mean $y$,
$$\mu_{y|x_1, x_2, \ldots, x_k} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$
and that (for fixed $x_1, x_2, \ldots, x_k$) there is around that mean a distribution of $y$ that is normal. Further, the standard assumption is that the standard deviation of the response distribution is constant in $x_1, x_2, \ldots, x_k$. In symbols it is standard to write
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon$$
where $\epsilon$ is normal with mean $0$ and standard deviation $\sigma$. This describes one $y$. Where several observations $y_i$ with corresponding values $x_{1i}, x_{2i}, \ldots, x_{ki}$ are under consideration, the assumption is that the $y_i$ (the $\epsilon_i$) are independent. (The $\epsilon_i$ are conceptually equivalent to unrelated random draws from the same fixed normal continuous distribution.) The model statement in its full glory is then
$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki} + \epsilon_i \quad\text{for } i = 1, 2, \ldots, n$$
where the $\epsilon_i$ for $i = 1, 2, \ldots, n$ are independent normal $(0, \sigma^2)$ random variables.

The model statement above is a perfectly theoretical matter. One can begin with it, and for specific choices of $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ and $\sigma$ find probabilities for $y$ at given values of $x_1, x_2, \ldots, x_k$. In applications, the real mode of operation is instead to take data vectors $(x_{11}, x_{21}, \ldots, x_{k1}, y_1), (x_{12}, x_{22}, \ldots, x_{k2}, y_2), \ldots, (x_{1n}, x_{2n}, \ldots, x_{kn}, y_n)$ and use them to make inferences about the parameters $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ and $\sigma$ and to make predictions based on the estimates (based on the empirically fitted model).

Descriptive Analysis of Approximately Linear $(x_1, x_2, \ldots, x_k, y)$ Data

Calculus can be invoked to find coefficients $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ minimizing the sum of squared vertical distances from data points in $(k+1)$-dimensional space to a fitted surface, $\sum \left(y_i - (\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki})\right)^2$. These "least squares" values DO NOT have simple formulas (unless one is willing to use matrix notation). And in particular, one can NOT simply somehow use the formulas from simple linear regression in this more complicated context. We will call these minimizing coefficients $b_0, b_1, b_2, \ldots, b_k$ and need to rely upon JMP to produce them for us. It is further common to refer to the value of $y$ on the "least squares surface" corresponding to $x_{1i}, x_{2i}, \ldots, x_{ki}$ as a fitted or predicted value
$$\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \cdots + b_k x_{ki}$$
Exactly as in SLR, one takes the difference between what is observed ($y_i$) and what is "predicted" or "explained" ($\hat{y}_i$) as a kind of leftover part or "residual" corresponding to a data value
$$e_i = y_i - \hat{y}_i$$
The total sum of squares, $SSTot = \sum (y_i - \bar{y})^2$, is (still) most of the sample variance of the values $y_i$ and measures raw variation in the response variable. Just as in SLR, the sum of squared residuals is a measure of variation in response remaining unaccounted for after fitting the equation to the data. As in SLR, people call it the error sum of squares and write
$$SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$$
(The formula looks exactly like the one for SLR. It is simply the case that now $\hat{y}_i$ is computed using all $k$ inputs, not just a single $x$.) One is still guaranteed that $SSTot \ge SSE$. So the difference $SSTot - SSE$ is a non-negative measure of variation accounted for in fitting the linear equation to the data. As in SLR, people call it the regression sum of squares and write
$$SSR = SSTot - SSE$$
The coefficient of (multiple) determination expresses $SSR$ as a fraction of $SSTot$ and is
$$R^2 = \frac{SSR}{SSTot}$$
which is interpreted as "the fraction of raw variation in $y$ accounted for in the model fitting process." This quantity can also be interpreted in terms of a correlation, as it turns out to be the square of the sample linear correlation between the observations $y_i$ and the fitted or predicted values $\hat{y}_i$.
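Since the MLR least squares values have no simple scalar formulas, software does the minimization (the summary relies on JMP; the following is a generic matrix-based sketch in Python with hypothetical data, where the intercept is handled by prepending a column of 1's).

```python
import numpy as np

# hypothetical data with k = 2 predictors
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0],
              [4.0, 3.0], [5.0, 5.0], [6.0, 4.0]])
y = np.array([4.1, 4.0, 8.2, 8.8, 11.5, 12.1])

X1 = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
b, *_ = np.linalg.lstsq(X1, y, rcond=None)   # b = (b0, b1, b2)

yhat = X1 @ b
SSE = np.sum((y - yhat) ** 2)
SSTot = np.sum((y - y.mean()) ** 2)
R2 = 1 - SSE / SSTot   # equals the squared correlation between y and yhat
print(b, SSE, R2)
```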

Parameter Estimates for MLR

The descriptive statistics for $(x_1, x_2, \ldots, x_k, y)$ data can be used to provide "single number estimates" of the (typically unknown) parameters of the multiple linear regression model. That is, the least squares coefficients $b_0, b_1, b_2, \ldots, b_k$ serve as estimates of the parameters $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$. The first of these is a kind of high-dimensional "intercept" and, in the case where the predictors are not functionally related, the others serve as rates of change of average $y$ with respect to a single $x$, provided the other $x$'s are held fixed. The variance of $y$ for fixed values $x_1, x_2, \ldots, x_k$ can be estimated by a kind of average of squared residuals
$$s^2 = \frac{1}{n-k-1}\sum e_i^2 = \frac{SSE}{n-k-1}$$
The square root of this "regression sample variance" is $s = \sqrt{s^2}$ and serves as a single number estimate of $\sigma$.

Interval-Based Inference Methods for MLR

The normal multiple linear regression model provides inference formulas for model parameters. Confidence limits for $\sigma$ are
$$s\,\sqrt{\frac{n-k-1}{\chi^2_{\text{upper}}}} \quad\text{and}\quad s\,\sqrt{\frac{n-k-1}{\chi^2_{\text{lower}}}}$$
where $\chi^2_{\text{upper}}$ and $\chi^2_{\text{lower}}$ are upper and lower quantiles of the $\chi^2$ distribution with $\nu = n-k-1$ degrees of freedom. Confidence limits for $\beta_j$ (the rate of change of average $y$ with respect to $x_j$) are
$$b_j \pm t\,(\text{standard error of } b_j)$$
where $t$ is a quantile of the $t$ distribution with $\nu = n-k-1$ degrees of freedom. There is no simple formula for "standard error of $b_j$" and in particular, one can NOT simply somehow use the formula from simple linear regression in this more complicated context. It IS the case that this standard error is a multiple of $s$, but we will have to rely upon JMP to provide it for us. Confidence limits for $\mu_{y|x_1, x_2, \ldots, x_k} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$ (the mean value of $y$ at a particular choice of the $x_1, x_2, \ldots, x_k$) are
$$\hat{y} \pm t\,(\text{standard error of } \hat{y})$$
There is no simple formula for "standard error of $\hat{y}$" and in particular one can NOT simply somehow use the formula from simple linear regression in this more complicated context. It IS the case that this standard error is a multiple of $s$, but we will have to rely upon JMP to provide it for us.

Prediction limits for an additional observation $y$ at a given vector $(x_1, x_2, \ldots, x_k)$ are
$$\hat{y} \pm t\,\sqrt{s^2 + (\text{standard error of } \hat{y})^2}$$

Hypothesis Tests and MLR

The normal multiple linear regression model supports hypothesis testing. $\mathrm{H}_0\!:\beta_j = \#$ can be tested using the test statistic
$$T = \frac{b_j - \#}{\text{standard error of } b_j}$$
and a $t_{n-k-1}$ reference distribution. $\mathrm{H}_0\!:\mu_{y|x_1, x_2, \ldots, x_k} = \#$ can be tested using the test statistic
$$T = \frac{\hat{y} - \#}{\text{standard error of } \hat{y}}$$
and a $t_{n-k-1}$ reference distribution.

ANOVA and MLR

As in SLR, the breaking down of $SSTot$ into $SSR$ and $SSE$ can be thought of as a kind of "analysis of variance" in $y$, and summarized in a special kind of table. The general form for MLR is as below.

ANOVA Table (for MLR Overall F Test)

Source       SS       df           MS                      F
Regression   SSR      k            MSR = SSR/k             F = MSR/MSE
Error        SSE      n - k - 1    MSE = SSE/(n - k - 1)
Total        SSTot    n - 1

(Note that as in SLR, the mean square for error is, in fact, the estimate of $\sigma^2$ (i.e. $MSE = s^2$).) As it turns out, the ratio in the "F" column can be used as a test statistic for the hypothesis $\mathrm{H}_0\!:\beta_1 = \beta_2 = \cdots = \beta_k = 0$. The appropriate reference distribution is the $F_{k,\,n-k-1}$ distribution.

"Partial F Tests" in MLR

It is possible to use ANOVA ideas to invent $F$ tests for investigating whether some whole group of $\beta$'s (short of the entire set) are all $0$. For example, one might want to test the hypothesis
$$\mathrm{H}_0\!:\beta_{p+1} = \beta_{p+2} = \cdots = \beta_k = 0$$

(This is the hypothesis that only the first $p$ of the $k$ input variables $x$ have any impact on the mean system response ... the hypothesis that the first $p$ of the $x$'s are adequate to predict $y$ ... the hypothesis that after accounting for the first $p$ of the $x$'s, the others do not contribute "significantly" to one's ability to explain or predict $y$.) If we call the model for $y$ in terms of all $k$ of the predictors the "full model" and the model for $y$ involving only $x_1$ through $x_p$ the "reduced model," then an $F$ test of the above hypothesis can be made using the statistic
$$F = \frac{(SSR_{\text{Full}} - SSR_{\text{Reduced}})/(k-p)}{MSE_{\text{Full}}}$$
and an $F_{k-p,\,n-k-1}$ reference distribution. $SSR_{\text{Full}} \ge SSR_{\text{Reduced}}$, so the numerator here is nonnegative. Finding a $p$-value for this kind of test is a means of judging whether $R^2$ for the full model is "significantly"/detectably larger than $R^2$ for the reduced model. (Caution here: statistical significance is not the same as practical importance. With a big enough data set, essentially any increase in $R^2$ will produce a small $p$-value.)

It is reasonably common to expand the basic MLR ANOVA table to organize calculations for this test statistic. This is the

(Expanded) ANOVA Table (for MLR)

Source                               SS                   df         MS                             F
Regression                           SSR_Full             k          MSR_Full = SSR_Full/k          MSR_Full/MSE_Full
  x_1, ..., x_p                      SSR_Red              p
  x_{p+1}, ..., x_k | x_1, ..., x_p  SSR_Full - SSR_Red   k - p      (SSR_Full - SSR_Red)/(k - p)   ((SSR_Full - SSR_Red)/(k - p))/MSE_Full
Error                                SSE_Full             n - k - 1  MSE_Full = SSE_Full/(n - k - 1)
Total                                SSTot                n - 1

Standardized Residuals in MLR

As in SLR, people sometimes wish to standardize residuals before using them to do model checking/diagnostics. While it is not possible to give a simple formula for the "standard error of $e_i$" without using matrix notation, most MLR programs will compute these values. The standardized residual for data point $i$ is then (as in SLR)
$$e_i^* = \frac{e_i}{\text{standard error of } e_i}$$
If the normal multiple linear regression model is a good one, these "ought" to look as if they are approximately normal with mean $0$ and standard deviation $1$.
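A sketch of the partial $F$ computation with simulated data follows (Python; the identity $SSR_{\text{Full}} - SSR_{\text{Red}} = SSE_{\text{Red}} - SSE_{\text{Full}}$ is used, since $SSTot$ is the same for both models).

```python
import numpy as np
from scipy import stats

def sse(X, y):
    """SSE for an intercept-plus-predictors least squares fit."""
    X1 = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return np.sum((y - X1 @ b) ** 2)

# simulated data: k = 3 predictors, test H0: beta2 = beta3 = 0 (so p = 1)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = 2 + 1.5 * X[:, 0] + rng.normal(size=20)

n, k, p = len(y), 3, 1
SSE_full, SSE_red = sse(X, y), sse(X[:, :p], y)
MSE_full = SSE_full / (n - k - 1)

F = ((SSE_red - SSE_full) / (k - p)) / MSE_full
p_val = stats.f.sf(F, k - p, n - k - 1)
print(F, p_val)
```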

Intervals and Tests for Linear Combinations of $\beta$'s in MLR

It is sometimes important to do inference for a linear combination of MLR model coefficients
$$L = c_0\beta_0 + c_1\beta_1 + c_2\beta_2 + \cdots + c_k\beta_k$$
(where $c_0, c_1, \ldots, c_k$ are known constants). Note, for example, that $\mu_{y|x_1, x_2, \ldots, x_k}$ is of this form for $c_0 = 1, c_1 = x_1, c_2 = x_2, \ldots,$ and $c_k = x_k$. Note too that a difference in mean responses at two sets of predictors, say $(x_1, x_2, \ldots, x_k)$ and $(x_1', x_2', \ldots, x_k')$, is of this form for $c_0 = 0, c_1 = x_1 - x_1', c_2 = x_2 - x_2', \ldots,$ and $c_k = x_k - x_k'$. An obvious estimate of $L$ is
$$\hat{L} = c_0 b_0 + c_1 b_1 + c_2 b_2 + \cdots + c_k b_k$$
Confidence limits for $L$ are
$$\hat{L} \pm t\,(\text{standard error of } \hat{L})$$
There is no simple formula for "standard error of $\hat{L}$." This standard error is a multiple of $s$, but we will have to rely upon JMP to provide it for us. (Computation of $\hat{L}$ and its standard error is under the "Custom Test" option in JMP.) $\mathrm{H}_0\!:L = \#$ can be tested using the test statistic
$$T = \frac{\hat{L} - \#}{\text{standard error of } \hat{L}}$$
and a $t_{n-k-1}$ reference distribution. Or, if one thinks about it for a while, it is possible to find a reduced model that corresponds to the restriction that the null hypothesis places on the MLR model and to use a "$p = k - 1$" partial $F$ test (with $1$ and $n-k-1$ degrees of freedom) equivalent to the $t$ test for this purpose.

MLR Model-Building

The MLR model and inference formulas above form the core of a set of tools in common use for building reliable predictions of $y$ from a large set of predictors $x_1, \ldots, x_k$. There are a number of extensions and ways of using these tools (and their extensions) that combine to form a full model-building technology. Some of the additional ideas are summarized below.

Diagnostic Tools

Residual Plotting

The residuals $e_i = y_i - \hat{y}_i$ from MLR are meant to be empirical approximations of the "random errors"
$$\epsilon_i = y_i - \mu_{y|x_{1i}, x_{2i}, \ldots, x_{ki}}$$
in the MLR model. The MLR model says that the $\epsilon_i$ are normal (with mean $0$ and standard deviation $\sigma$) and independent. So one should expect the residuals to be

describable in approximately these terms. They should be essentially "patternless normal random noise" and if they aren't, a problem with the corresponding MLR model is indicated. (This possibility then causes one to be skeptical of the appropriateness of any probability-based inferences based on the MLR model.) Common ways of looking at MLR residuals (or their standardized/studentized versions $e_i^*$) are to 1) normal-plot them, hoping to see an approximately linear plot, and 2) plot them against any variables of interest (like, for example, $x_1, \ldots, x_k$, $\hat{y}$ or $y$, time order of observation, values of any variable potentially of importance but not included in the model, etc.) looking for a pattern that can simultaneously suggest a problem with a current model and identify possible remedial measures. Chapter 7 of the text discusses residual plotting in some detail, and we'll return to this topic later.

Diagnostic Measures/Statistics

A Pooled Sample Standard Deviation and a Lack-of-Fit F Test

In problems where there are one or more $(x_1, x_2, \ldots, x_k)$ vectors that have multiple responses $y$, it is possible to make an estimate of $\sigma$ that doesn't depend for its appropriateness on the particular form of the relationship between $x_1, x_2, \ldots, x_k$ and mean $y$ used in the MLR model. That is, if there are $g$ groups of $y$'s, each coming from a single $(x_1, x_2, \ldots, x_k)$ combination and having a group sample variance $s_j^2$, then one can make a kind of "pooled standard deviation" from these as
$$s_{\text{Pooled}} = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2 + \cdots + (n_g-1)s_g^2}{(n_1-1) + (n_2-1) + \cdots + (n_g-1)}}$$
Provided that $\sigma$ doesn't change with $x_1, x_2, \ldots, x_k$, this is a legitimate estimate of $\sigma$, regardless of whether or not one has an appropriate form for $\mu_{y|x_1, x_2, \ldots, x_k}$. On the other hand, the MLR sample standard deviation $s$ ($= \sqrt{MSE}$) will tend to overestimate $\sigma$ if $\mu_{y|x_1, x_2, \ldots, x_k} \ne \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$. So informal comparison of $s$ to $s_{\text{Pooled}}$ is a means of doing model diagnosis (a large difference being indicative of poor model fit).

This can be made more formal by inventing a related $F$ test statistic. That is, the MLR error sum of squares is sometimes broken down as
$$SSE = (n-g)s_{\text{Pooled}}^2 + \left(SSE - (n-g)s_{\text{Pooled}}^2\right)$$
and the (nonnegative) terms on the right of this equation are given the names $SSPE$ ("pure error" sum of squares) and $SSLoF$ ("lack of fit" sum of squares). That is,
$$SSPE = (n-g)s_{\text{Pooled}}^2$$

and
$$SSLoF = SSE - SSPE$$
In this notation, the statistic
$$F = \frac{SSLoF/\left((n-k-1) - (n-g)\right)}{SSPE/(n-g)} = \frac{SSLoF/(g-k-1)}{SSPE/(n-g)}$$
is an index of whether the two estimates of $\sigma$ are detectably different, with large values corresponding to cases where $s$ is much larger than $s_{\text{Pooled}}$. Then, as it turns out, under the MLR model this statistic has an $F_{g-k-1,\,n-g}$ reference distribution and there is a formal $p$-value to be associated with a disparity between $s$ and $s_{\text{Pooled}}$ (the $F_{g-k-1,\,n-g}$ probability to the right of the observed value). People sometimes even go so far as to add a pair of lines to the MLR ANOVA table, breaking down the "Error" source (and corresponding sum of squares and df) into "LoF" and "Pure Error" components.

$R^2$ and $s = \sqrt{MSE}$

As one searches through myriad possible MLR models potentially describing the relationship between predictors $x$ and a response $y$, one generally wants a model with a "small" number of predictors (a simple or parsimonious model), a "large" value of $R^2$ and a "small" value of $s$. The naive "solution" to the model search problem of just "picking the biggest possible model" is in fact no solution, in light of the potential for "overfitting" a data set (adopting a model with too many "wiggles" that can nicely reproduce the $y$'s in the data set, but that does a very poor job when used to produce mild extrapolations or interpolations).

Other Functions of $SSE$

For a given prediction problem (and therefore fixed $SSTot$), $R^2$ and $s = \sqrt{MSE}$ are "equivalent" in the sense that one could be obtained from the other (and $SSTot$). They don't, however, necessarily produce the same ordering of reduced models of some grand MLR model in terms of "best looking" values of the criteria (e.g. the full model has the largest $R^2$ but may not have the smallest $s$). There are several other functions of $SSE$ that have been suggested as possible statistics for model diagnostics/selection. Among them are Mallows' $C_p$ and Akaike's Information Criterion (the AIC).

Mallows' $C_p$ is based on the fact that the average total squared difference between the values $\hat{y}_i$ from a MLR fit and their (real) means $\mu_i$ can be worked out theoretically. When divided by $\sigma^2$, this quantity is a quantity $\Gamma_p$ that is $(p+1)$ when there is a choice of $\beta$'s in a $p$-predictor MLR model that produces correct means for all $n$ data points, and is otherwise larger. Mallows' suggestion for comparing reduced versions of a full ($k$-predictor) MLR model is that for a reduced model with $p \le k$ predictors, one compute
$$C_p = \frac{SSE_{\text{Red}}}{MSE_{\text{Full}}} - \left(n - 2(p+1)\right)$$
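A sketch computing $C_p$, AIC, and PRESS for a reduced model with simulated data follows (Python; the PRESS computation uses the leverage identity $PRESS = \sum (e_i/(1-h_{ii}))^2$ that is stated later in this summary, and the AIC form used is the $n\ln(SSE/n) + 2(p+1)$ version given below).

```python
import numpy as np

def fit_sse_press(X, y):
    """SSE and PRESS for an intercept-plus-predictors least squares fit."""
    X1 = np.column_stack([np.ones(len(y)), X])
    H = X1 @ np.linalg.solve(X1.T @ X1, X1.T)   # hat matrix; diag(H) = leverages
    e = y - H @ y
    return np.sum(e ** 2), np.sum((e / (1 - np.diag(H))) ** 2)

rng = np.random.default_rng(1)
X = rng.normal(size=(25, 4))                    # hypothetical full model, k = 4
y = 1 + 2 * X[:, 0] - X[:, 1] + rng.normal(size=25)
n, k = len(y), X.shape[1]

SSE_full, _ = fit_sse_press(X, y)
MSE_full = SSE_full / (n - k - 1)

p = 2                                           # a candidate reduced model with p predictors
SSE_red, PRESS_red = fit_sse_press(X[:, :p], y)
Cp = SSE_red / MSE_full - (n - 2 * (p + 1))
AIC = n * np.log(SSE_red / n) + 2 * (p + 1)
print(Cp, AIC, PRESS_red)
```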

(an estimate of $\Gamma_p$ if the full $k$-variable MLR model is correct) and look for small $p$ and $C_p$ no more than about $(p+1)$. The thinking is that such a reduced model is simpler than the full model and appears to produce predictions comparable to the full model predictions at the $(x_1, x_2, \ldots, x_k)$ vectors in the data set.

Akaike's Information Criterion is based on considerations beyond the scope of this exposition. For a MLR model with $k$ predictors, it is
$$AIC = n\ln\!\left(\frac{SSE}{n}\right) + 2(k+1)$$
and people look for small values of $AIC$.

PRESS Statistic

This is a diagnostic measure built on the notion that a model should not be terribly sensitive to individual data points used to fit it, or equivalently that one ought to be able to predict a response even without using that response to fit the model. Beginning with a particular form for a MLR model and $n$ data points, let
$$\hat{y}_{i(i)} = \text{the value of } y_i \text{ predicted by a model fit to the other } (n-1) \text{ data points}$$
(note that this is not necessarily $\hat{y}_i$). The "prediction sum of squares" is
$$PRESS = \sum \left(y_i - \hat{y}_{i(i)}\right)^2$$
and one wants small values of this.

Model Search Algorithms

Given a particular full model (a particular set of $k$ predictors), to do model selection one needs some computer-assisted means of "poking around" in the (often very large) set of possible reduced models looking for good reduced models. Statistical packages must offer some methods for this. Probably the best such methodology is of the "all possible regressions" variety. This kind of routine will, for a given set of $k$ predictors, produce a list of the top few models with $1$ predictor, the top few models with $2$ predictors, ..., the top few models with $(k-1)$ predictors. (Slick computational schemes make this possible despite the fact that the number of models to be checked grows astronomically with the size of the full model, $k$.)

An older (and really, inferior and completely ad hoc) methodology attempts to "add in" or "drop out" variables in regression models one at a time on the basis of $p$-values for ($t$ or $F$) tests of hypotheses that their regression coefficients are $0$. (An algorithm adds in or drops out the most obvious predictor at each step.) This kind of "stepwise regression" methodology can be run in a purely "backwards elimination" mode (that begins with the full model and successively drops single variables), in a purely "forward selection" mode (that begins with a model containing only the predictor most highly correlated with $y$) or in a "mixed" mode that at any step can either add or drop a predictor variable depending upon the $p$-values for adding and dropping. JMP has implemented stepwise regression as its model searching tool.

The reason that stepwise searching is inferior to all-possible-regressions searches is that when one by some stepwise means or another gets to a particular $p$-variable model, one is NOT guaranteed that such a model is even the "best" one available of size $p$ according to an $R^2$ (or any other) criterion. Only the exhaustive search provided by an all-possible-regressions algorithm produces such a guarantee.

Creating New Variables from Existing Ones

Where neither a full MLR model for $y$ in terms of all available predictors $x_1, \ldots, x_k$, nor any reduction of it is satisfactory, one is far from "out of tricks to pull" in the quest to find a useful means of predicting $y$. One obvious possibility is to replace $y$ and/or one or more of the $x$'s with "transformed" versions of themselves. One might take square roots or logarithms (or ?????) of the responses and/or one or more of the predictors, do the modeling and inference, and "untransform" back to original scales of measurement in order to interpret the inferences (by squaring or exponentiating, or ?????).

Another possibility is to fit models that are not simply linear in the predictor(s) but quadratic, or cubic, or ... For example, the full quadratic MLR regression model for $y$ in the $k = 2$ predictors $x_1$ and $x_2$ is
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \epsilon$$
For a single predictor $x$, JMP will do the fitting of a polynomial for $y$ under the Fit Y by X menu. For $k$ different $x$'s, one needs to create powers of individual predictors and cross product terms by using the "cross" button to add them to the "Effects in Model" portion of the dialog box (or to use the "Response Surface" macro to fill that portion of the box after highlighting the original predictors in the list of columns) under the JMP Fit Model menu.

The appearance of the cross product term in the quadratic model above raises the possibility of using predictors that are functions of more than one of a set of basic variables $x_1, \ldots, x_k$. Such terms are often called "interaction" terms. A model without interaction terms is sometimes called "additive" in that a mean response is gotten by simply adding to an overall $\beta_0$ the separate contributions due to each of the $k$ predictors. For an additive model (one without interactions), plots of mean $y$ against any one of the predictors, say $x_j$, are parallel for different sets of the other $x$'s. Models with interactions have plots of mean $y$ versus an $x_j$ involved in an interaction that are NOT parallel.

Qualitative Factors/Inputs and Dummy Variables

At first look, it would seem that MLR has nothing to say about problems where some or all of the basic system inputs that determine the nature of a response are qualitative rather than quantitative. But as a matter of fact, with the proper amount of cleverness, it's possible to put even qualitative factors into the MLR framework. Consider a factor, call it A, that has $I$ possible "levels" or settings. (A could, for example, be something like employee gender with $I = 2$.) It is then possible to represent A in MLR notation through the creation of $I - 1$ "dummy" variables. That is, one defines

$$x_{A1} = \begin{cases} 1 & \text{if the observation is from level } 1 \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
$$x_{A2} = \begin{cases} 1 & \text{if the observation is from level } 2 \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
$$\vdots$$
$$x_{A,I-1} = \begin{cases} 1 & \text{if the observation is from level } I-1 \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
Then, the model
$$y = \beta_0 + \beta_1 x_{A1} + \beta_2 x_{A2} + \cdots + \beta_{I-1} x_{A,I-1} + \epsilon$$
says that observations are normal with standard deviation $\sigma$ and mean
$$\begin{cases} \beta_0 + \beta_1 & \text{if observation is from level } 1 \text{ of A} \\ \beta_0 + \beta_2 & \text{if observation is from level } 2 \text{ of A} \\ \quad\vdots \\ \beta_0 + \beta_{I-1} & \text{if observation is from level } I-1 \text{ of A} \\ \beta_0 & \text{if observation is from level } I \text{ of A} \end{cases}$$
All of the MLR machinery is available to do inference for the $\beta$'s and sums and differences thereof (that amount to means and differences in mean responses under various levels of A).

The approach above is the one taken in the textbook. But other (equivalent) versions of this business are possible. Two are of special interest because of the way that JMP does its automatic coding for qualitative "nominal" and "ordinal" variables. That is, a first alternative to the method above is to do what JMP seems to do for "nominal" variables. Define $I - 1$ variables
$$x'_{A1} = \begin{cases} 1 & \text{if the observation is from level } 1 \text{ of A} \\ -1 & \text{if the observation is from level } I \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
$$x'_{A2} = \begin{cases} 1 & \text{if the observation is from level } 2 \text{ of A} \\ -1 & \text{if the observation is from level } I \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
$$\vdots$$
$$x'_{A,I-1} = \begin{cases} 1 & \text{if the observation is from level } I-1 \text{ of A} \\ -1 & \text{if the observation is from level } I \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$

The model
$$y = \beta_0 + \beta_1 x'_{A1} + \beta_2 x'_{A2} + \cdots + \beta_{I-1} x'_{A,I-1} + \epsilon$$
then says that observations are normal with standard deviation $\sigma$ and mean
$$\begin{cases} \beta_0 + \beta_1 & \text{if observation is from level } 1 \text{ of A} \\ \beta_0 + \beta_2 & \text{if observation is from level } 2 \text{ of A} \\ \quad\vdots \\ \beta_0 + \beta_{I-1} & \text{if observation is from level } I-1 \text{ of A} \\ \beta_0 - (\beta_1 + \beta_2 + \cdots + \beta_{I-1}) & \text{if observation is from level } I \text{ of A} \end{cases}$$
With this coding, the sum of the $I$ means is $I\beta_0$ and thus $\beta_0$ is the arithmetic average of the $I$ means. The other $\beta$'s are then deviations of the "first" $I - 1$ means from this arithmetic average of the $I$ means.

A third version of this is what JMP seems to do for "ordinal" variables. Define $I - 1$ variables
$$x''_{A2} = \begin{cases} 1 & \text{if the observation is from level } 2, 3, \ldots, \text{ or } I \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
$$x''_{A3} = \begin{cases} 1 & \text{if the observation is from level } 3, 4, \ldots, \text{ or } I \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
$$\vdots$$
$$x''_{A,I} = \begin{cases} 1 & \text{if the observation is from level } I \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
Then, the model
$$y = \beta_0 + \beta_2 x''_{A2} + \beta_3 x''_{A3} + \cdots + \beta_I x''_{A,I} + \epsilon$$
says that observations are normal with standard deviation $\sigma$ and mean
$$\begin{cases} \beta_0 & \text{if observation is from level } 1 \text{ of A} \\ \beta_0 + \beta_2 & \text{if observation is from level } 2 \text{ of A} \\ \beta_0 + \beta_2 + \beta_3 & \text{if observation is from level } 3 \text{ of A} \\ \quad\vdots \\ \beta_0 + \beta_2 + \cdots + \beta_I & \text{if observation is from level } I \text{ of A} \end{cases}$$
The "intercept" here is the mean for the first level of A, and the other $\beta$'s are the differences between means for successive levels of A (in the order $1$ through $I$).
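The three codings can be constructed mechanically. Here is a Python sketch (pandas and numpy assumed) for a hypothetical three-level factor; the "JMP-style" labels reflect the descriptions above, not output taken from JMP itself.

```python
import numpy as np
import pandas as pd

A = pd.Series(["a", "b", "c", "a", "c"], dtype="category")  # hypothetical factor, I = 3 levels
I = len(A.cat.categories)

# 0/1 dummies for levels 1, ..., I-1 (the textbook coding)
X_01 = pd.get_dummies(A, dtype=float).iloc[:, : I - 1]

# JMP-style "nominal" coding: rows from the last level get -1 in every column
X_nom = X_01.copy()
X_nom[(A == A.cat.categories[-1]).to_numpy()] = -1.0

# JMP-style "ordinal" coding: the column built from (codes >= j) is x''_{A,j+1},
# equal to 1 for observations at level j+1 or higher
codes = A.cat.codes.to_numpy()   # 0, ..., I-1 in level order
X_ord = np.column_stack([(codes >= j).astype(float) for j in range(1, I)])

print(X_01, X_nom, X_ord, sep="\n\n")
```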

Once one has seen this idea of using $I - 1$ dummies to represent a single qualitative factor with $I$ levels, it is easy to go on and include more than one qualitative factor in a model (through the use of a second set of dummies) and to create interactions involving qualitative factors (by taking products with all of its dummies), etc. It is worth considering in detail what one gets from using dummy variables where there are two qualitative factors and all possible combinations of levels of those factors are represented in the data set. (The standard jargon for "all possible combinations of $I$ levels of A and $J$ levels of B represented in a data set" is that the data contain a "(full) two-way factorial in the factors A and B.") The figure below shows the $I \times J$ different combinations of levels of the factors laid out in a table, with "cell mean" responses filling the cells.

                 Factor B
                 1             2             ...   J
Factor A    1    $\mu_{11}$    $\mu_{12}$    ...   $\mu_{1J}$
            2    $\mu_{21}$    $\mu_{22}$    ...   $\mu_{2J}$
            ...
            I    $\mu_{I1}$    $\mu_{I2}$    ...   $\mu_{IJ}$

It is reasonable to ask what dummy variables can provide in terms of modeling a response in this context. Since it is the most sensible coding provided automatically by your software, let us consider the $x'$ coding above, instead of the $x$ coding discussed in Chapter 5 of your text, in what follows. For $i = 1, 2, \ldots, I-1$ let
$$x'_{Ai} = \begin{cases} 1 & \text{if the observation is from level } i \text{ of A} \\ -1 & \text{if the observation is from level } I \text{ of A} \\ 0 & \text{otherwise} \end{cases}$$
and for $j = 1, 2, \ldots, J-1$ let
$$x'_{Bj} = \begin{cases} 1 & \text{if the observation is from level } j \text{ of B} \\ -1 & \text{if the observation is from level } J \text{ of B} \\ 0 & \text{otherwise} \end{cases}$$
A MLR regression model for response $y$ (first) involving only the dummies themselves is
$$y = \beta_0 + \beta_{A1}x'_{A1} + \beta_{A2}x'_{A2} + \cdots + \beta_{A,I-1}x'_{A,I-1} + \beta_{B1}x'_{B1} + \beta_{B2}x'_{B2} + \cdots + \beta_{B,J-1}x'_{B,J-1} + \epsilon \quad (*)$$
This (no-interactions) model says that for $i \le I-1$ and $j \le J-1$
$$\mu_{ij} = \beta_0 + \beta_{Ai} + \beta_{Bj}$$

for $j \le J-1$
$$\mu_{Ij} = \beta_0 - \sum_{i=1}^{I-1}\beta_{Ai} + \beta_{Bj}$$
for $i \le I-1$
$$\mu_{iJ} = \beta_0 + \beta_{Ai} - \sum_{j=1}^{J-1}\beta_{Bj}$$
and that
$$\mu_{IJ} = \beta_0 - \sum_{i=1}^{I-1}\beta_{Ai} - \sum_{j=1}^{J-1}\beta_{Bj}$$
The no-interactions model says that with the level of A (B) held fixed, as one moves across levels of B (A), the mean responses are changed by adding different $\beta$'s to $\beta_0$, and that the same addition would be done on every fixed level of A (B). There are "parallel traces of means" as one moves across levels of B (A) for the different levels of A (B).

Consider too what one gets from averaging means across rows and down columns in the two-way table. Letting a dot subscript indicate that one has averaged out over the missing subscript, one can extend the table above to get

                 Factor B
                 1                 2                 ...   J                 row average
Factor A    1    $\mu_{11}$        $\mu_{12}$        ...   $\mu_{1J}$        $\mu_{1\cdot}$
            2    $\mu_{21}$        $\mu_{22}$        ...   $\mu_{2J}$        $\mu_{2\cdot}$
            ...
            I    $\mu_{I1}$        $\mu_{I2}$        ...   $\mu_{IJ}$        $\mu_{I\cdot}$
column average   $\mu_{\cdot 1}$   $\mu_{\cdot 2}$   ...   $\mu_{\cdot J}$   $\mu_{\cdot\cdot}$

Adding the above expressions for the $\mu_{ij}$ in terms of the $\beta$'s across a row and dividing by $J$, it becomes clear that for $i \le I-1$,
$$\beta_{Ai} = \mu_{i\cdot} - \mu_{\cdot\cdot}$$
the difference between the row average mean and the grand mean. Similarly, adding the above expressions for the $\mu_{ij}$ in terms of the $\beta$'s down a column and dividing by $I$, it becomes clear that for $j \le J-1$,
$$\beta_{Bj} = \mu_{\cdot j} - \mu_{\cdot\cdot}$$
the difference between the column average mean and the grand mean. These functions of the means $\mu_{ij}$ are common summaries of a complete two-way table of $I \times J$ means, and standard jargon is that the
$$\text{"main effect of A at its } i\text{th level"} = \mu_{i\cdot} - \mu_{\cdot\cdot}$$

while the
$$\text{"main effect of B at its } j\text{th level"} = \mu_{\cdot j} - \mu_{\cdot\cdot}$$
Our exposition here says that the regression coefficients $\beta$ with the JMP "nominal" coding of qualitative factors are (at least in the no-interaction model) exactly the factor main effects. Note that for the last level of the factors, the fact that the $\mu_{i\cdot} - \mu_{\cdot\cdot}$ (and the $\mu_{\cdot j} - \mu_{\cdot\cdot}$) sum to zero means that the main effect for the last level of a factor is the negative sum of the other main effects.

Now consider what happens when one adds to the no-interaction model (*) all cross products of $x'_{Ai}$ and $x'_{Bj}$ terms. One then has a model with number of predictors
$$k = (I-1) + (J-1) + (I-1)(J-1) = IJ - 1$$
It should not then be completely surprising that such a model allows for any possible choice of the $IJ$ means $\mu_{ij}$. That is, the model with all the A dummies, the B dummies and the products of A and B dummies in it is really equivalent to starting with $IJ$ levels of a single omnibus factor and making up $IJ - 1$ dummies from scratch. The advantage of using the present coding (instead of starting all over and making up a new coding for a single omnibus factor) is that the cross product terms are interpretable. That is, as it turns out, a model extending (*) by including all possible cross product terms still ends up implying that for $i \le I-1$
$$\beta_{Ai} = \mu_{i\cdot} - \mu_{\cdot\cdot}$$
and for $j \le J-1$,
$$\beta_{Bj} = \mu_{\cdot j} - \mu_{\cdot\cdot}$$
and then that for $i \le I-1$ and $j \le J-1$
$$\mu_{ij} = \beta_0 + \beta_{Ai} + \beta_{Bj} + \beta_{ABij}$$
so that for such $i$ and $j$
$$\beta_{ABij} = \mu_{ij} - (\beta_0 + \beta_{Ai} + \beta_{Bj}) = \mu_{ij} - (\text{mean from the no-interaction model}) = \mu_{ij} - (\text{grand mean} + i\text{th A main effect} + j\text{th B main effect})$$
It is common to define for all $i$ and $j$
$$\text{interaction of A at level } i \text{ and B at level } j = \mu_{ij} - (\text{mean from the no-interaction model}) = \mu_{ij} - (\text{grand mean} + i\text{th A main effect} + j\text{th B main effect})$$
so that the $(I-1)(J-1)$ cross product $\beta$'s can be interpreted as "interactions" measuring departure from additivity/parallelism. As it turns out, the "interactions" for a given row or column sum to $0$, so that one can get them for the last level $I$ of A or level $J$ of B as the negative sum of the others in the corresponding column or row.

Notice that considering a full model including all A dummies, all B dummies and all A×B dummy products, various reduced models have sensible interpretations. The reduced model
$$y = \beta_0 + \beta_{A1}x'_{A1} + \cdots + \beta_{A,I-1}x'_{A,I-1} + \beta_{B1}x'_{B1} + \cdots + \beta_{B,J-1}x'_{B,J-1} + \epsilon$$
is obviously one of "no A×B interactions." (This model says that there are parallel traces on a plot of mean $y$ versus level of one of the factors.) The reduced model
$$y = \beta_0 + \beta_{A1}x'_{A1} + \cdots + \beta_{A,I-1}x'_{A,I-1} + \epsilon$$
is one of "A main effects only." (This model says that all means in a given row are the same.) And the reduced model
$$y = \beta_0 + \beta_{B1}x'_{B1} + \cdots + \beta_{B,J-1}x'_{B,J-1} + \epsilon$$
is one of "B main effects only." (This model says that all means in a given column are the same.)

Of course, all of the MLR regression machinery is then available for doing everything from making confidence intervals for the factorial effects (that are $\beta$'s or linear combinations thereof), to looking at plots of mean responses on JMP, to predicting new responses at particular combinations of levels of the factors, to testing reduced models against the full model, etc.

Dummy Variables and Piece-Wise Regressions

Dummy variables are very useful objects. We saw above that they can be used to incorporate basically qualitative information into a MLR analysis. They can also be used to allow one to "piece together" different fitted curves defined over different regions. To illustrate what is possible, suppose that one wants to model $y$ so that $\mu_{y|x}$ is a continuous function of $x$, having the properties that for $k_1 < k_2$ (the known locations of so-called "knots"), $\mu_{y|x}$ is a linear function of $x$ for $x \le k_1$, a possibly different linear function of $x$ for $k_1 \le x \le k_2$, and a yet possibly different linear function of $x$ for $k_2 \le x$. This can be done as follows. Define two dummy variables
$$x_2 = \begin{cases} 0 & \text{if } x < k_1 \\ 1 & \text{if } k_1 \le x \end{cases} \quad\text{and}\quad x_3 = \begin{cases} 0 & \text{if } x < k_2 \\ 1 & \text{if } k_2 \le x \end{cases}$$
A model with the target properties is then
$$y = \beta_0 + \beta_1 x + \beta_2 (x - k_1)x_2 + \beta_3 (x - k_2)x_3 + \epsilon$$
Notice that this model has
$$\mu_{y|x} = \beta_0 + \beta_1 x \quad\text{for } x \le k_1$$
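A numpy sketch of fitting this continuous, two-knot piecewise-linear model to simulated data follows; note that $(x - k_1)x_2$ is just the "positive part" $(x - k_1)_+$, which np.clip computes directly.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 10.0, size=40))   # hypothetical x's
k1, k2 = 4.0, 7.0                              # known knot locations

# continuous piecewise-linear mean plus normal noise (for illustration)
mu = 1.0 + 0.5 * x + 0.8 * np.clip(x - k1, 0, None) - 1.5 * np.clip(x - k2, 0, None)
y = mu + rng.normal(scale=0.3, size=x.size)

# design matrix for y = b0 + b1*x + b2*(x - k1)*x2 + b3*(x - k2)*x3 + error
x2 = (x >= k1).astype(float)
x3 = (x >= k2).astype(float)
X = np.column_stack([np.ones_like(x), x, (x - k1) * x2, (x - k2) * x3])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # roughly (1.0, 0.5, 0.8, -1.5)
```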

and
$$\mu_{y|x} = (\beta_0 - \beta_2 k_1) + (\beta_1 + \beta_2)x \quad\text{for } k_1 \le x \le k_2$$
and
$$\mu_{y|x} = (\beta_0 - \beta_2 k_1 - \beta_3 k_2) + (\beta_1 + \beta_2 + \beta_3)x \quad\text{for } k_2 \le x$$
The same kind of thing can be done with other numbers of knots and with higher order polynomials (like, e.g., quadratics or cubics). For higher order polynomials, it can even be done in a way that forces the curve defined by $\mu_{y|x}$ to not have "sharp corners" at any knot. All of this is still in the framework provided by MLR.

Diagnostic Plots and More Diagnostic Measures

There are various kinds of residuals, ways of plotting them, and measures of "influence" on a regression that are meant to help in the black art of model building. We have already alluded to the fact that under the MLR model, we expect ordinary residuals
$$e_i = y_i - \hat{y}_i$$
to look like mean $0$ normal random noise and that standardized residuals
$$e_i^* = \frac{e_i}{\text{standard error of } e_i}$$
should look like standard normal random noise. In the context of defining the $PRESS$ statistic we alluded to the notion of deleted residuals
$$e_{(i)} = y_i - \hat{y}_{i(i)}$$
and the hope that if a model is a good one and not overly sensitive to the exact data vectors used to fit it, these shouldn't be ridiculously larger in magnitude than the regular residuals $e_i$. This does not exhaust the ways in which people have suggested using the residual idea. It is possible to invent standardized/studentized deleted residuals
$$e_{(i)}^* = \frac{e_{(i)}}{\text{standard error of } e_{(i)}}$$
and there are yet other possibilities.

Partial Residual Plots (JMP "Effect Leverage Plots")

In somewhat nonstandard language, SAS/JMP makes what it calls "effect leverage plots" that accompany its "effect tests." These are based on another kind of residuals, sometimes called partial residuals. With $k$ predictor variables, I might think about understanding the importance of variable $j$ by considering residuals computed using only the other $k - 1$ predictor variables to do prediction (i.e. using a reduced model not including $x_j$). Although it is nearly impossible to see this from their manual and help functions or how the axes of the plots are labeled, the effect leverage plot in JMP for variable $j$ is a plot of

$$e^{(j)}(y)_i = \text{the } i\text{th } y \text{ residual regressing on all predictor variables except } x_j$$
versus
$$e^{(j)}(x_j)_i = \text{the } i\text{th } x_j \text{ residual regressing on all predictor variables except } x_j$$
On this plot there is a horizontal line drawn (ostensibly at $\bar{y}$) that really represents a $y$ partial residual equal to $0$ ($y_i$ perfectly predicted by all predictors excepting $x_j$). (The vertical axis IS in the original $y$ units, but should not really be labeled as $y$, but rather as partial residual.) The sum of squared vertical distances from the plotted points to this line is then $SSE$ for a model without predictor $j$. The horizontal plotting positions of the points are in the original $x_j$ units, but are partial residuals of the $x_j$'s, NOT the $x_j$'s themselves. The horizontal center of the plot is at an $x_j$ partial residual of $0$, not at $\bar{x}_j$ as JMP (inaccurately) represents things. The non-horizontal line on the plot is in fact the least squares line through the plotted points. What is interesting is that the usual residuals from that least squares line are the residuals for the full MLR fit to the data. So the sum of the squared vertical distances from points to the sloped line is then $SSE$ for the full model. The larger the reduction in $SSE$ from the horizontal line to the sloped one, the smaller the $p$-value for testing $\mathrm{H}_0\!:\beta_j = 0$.

Highlighting a point on a JMP partial residual plot makes it bigger on the other plots and highlights it in the data table (for examination or, for example, potential exclusion). We can at least on these plots see which points are fit poorly in a model that excludes a given predictor and the effect the addition of that last predictor has on the prediction of that $y$. (Note that points near the center of the horizontal scale are ones that have $x_j$ that can already be predicted from the other $x$'s, and so addition of $x_j$ to the prediction equation does not much change the residual. Points far to the right or left of center have values of predictor $j$ that are unlike their predictions from the other $x$'s. They both tend to more strongly influence the nature of the change in the model predictions as $x_j$ is added to the model, and tend to have their residuals more strongly affected than points in the middle of the plot, where $x_j$ might be predicted from the other $x$'s.)

Leverage

The notion of how much potential influence a single data point has on a fit is an important one. The JMP partial residual plot/"effect leverage" plot is aimed at addressing this issue by highlighting points with large $x_j$ partial residuals. Another notion of the same kind is based on the fact that there are numbers $h_{ii'}$ (for $i = 1, \ldots, n$ and $i' = 1, \ldots, n$) depending upon the vectors $(x_{1i}, x_{2i}, \ldots, x_{ki})$ only (and not the $y$'s) so that each $\hat{y}_i$ is
$$\hat{y}_i = h_{i1}y_1 + h_{i2}y_2 + \cdots + h_{ii}y_i + \cdots + h_{in}y_n$$
$h_{ii}$ is then somehow a measure of how heavily $y_i$ is counted in its own prediction and is usually called the leverage of the corresponding data point. It is a fact that $0 \le h_{ii} \le 1$ and $\sum h_{ii} = k + 1$. So the $h_{ii}$'s average to $(k+1)/n$, and a plausible rule of thumb is
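A sketch computing leverages from the hat matrix for simulated predictors and applying the twice-the-average rule of thumb (the $h_{ii'}$ above are exactly the entries of the matrix $H$ with $\hat{y} = Hy$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 30, 2
X = rng.normal(size=(n, k))                    # hypothetical predictors
X1 = np.column_stack([np.ones(n), X])

H = X1 @ np.linalg.solve(X1.T @ X1, X1.T)      # hat matrix: yhat = H y
h = np.diag(H)                                 # leverages h_ii

print(h.sum())                                 # equals k + 1
flagged = np.where(h > 2 * (k + 1) / n)[0]     # rule-of-thumb high-leverage points
print(flagged)
```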

that when a single $h_{ii}$ is more than twice this average value, the corresponding data point has an important $(x_{1i}, x_{2i}, \ldots, x_{ki})$.

It is not at all obvious, but as it turns out, the $PRESS$ statistic has the formula
$$PRESS = \sum \left(\frac{e_i}{1 - h_{ii}}\right)^2$$
involving these leverage values. This shows that big $PRESS$ values occur when big leverages are associated with large ordinary residuals.

Cook's D

The leverage $h_{ii}$ involves only predictors and no $y$'s. A proposal by Cook to measure the overall effect that point $i$ has on the regression is the statistic
$$D_i = \frac{e_i^2}{(k+1)MSE}\left(\frac{h_{ii}}{(1 - h_{ii})^2}\right)$$
where large values of this "Cook's Distance" supposedly identify points that by virtue of either their leverage or their large ordinary residual are "influential."

0-1 Responses

Sometimes the response, $y$, is an indicator of whether or not some event of interest has occurred
$$y = \begin{cases} 1 & \text{if the event occurs} \\ 0 & \text{if the event does not occur} \end{cases}$$
It is possible to think about assessing the impact of some predictor variables $x_1, x_2, \ldots, x_k$ on such a $y$. But ordinary regression analysis is not the right vehicle for doing so. The standard regression models have normal $y$'s, not 0-1 $y$'s. Here the mean of $y$ is
$$\mu_{y|x_1, x_2, \ldots, x_k} = P[y = 1] = p$$
and is constrained to be between $0$ and $1$. The most commonly available technology for doing inference here is so-called "logistic regression," which says that the "log odds ratio is linear in $x_1, x_2, \ldots, x_k$." That is, the assumption is that
$$\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \quad (**)$$
and the $\beta$'s become the increase in log odds ratio for a unit increase in the corresponding predictor, the other predictors held fixed. (Note that as the log odds ratio increases, $p$ increases, a log odds ratio of $0$ corresponding to $p = .5$.) The actual technology required to fit the relationship (**) is more complicated than least squares, and the methods of inference are based on different mathematics. But using a package like JMP, these differences are largely invisible to a user, and one can reason from the JMP report mostly by analogy to ordinary regression. Both "Fit Y by X" and "Fit Model" in JMP will automatically fit (**) if one gives it a nominal response variable. One bit of confusion that is possible here concerns the fact that JMP knows numerical order and alphabetical order. So if you ask it to do logistic regression, it will do so for $p$ corresponding to what it considers to be the "first" category.
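For readers working outside JMP, here is a sketch of fitting (**) by maximum likelihood with statsmodels on simulated 0-1 data; the variable names and the simulated coefficients are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(-2, 2, size=100)               # hypothetical single predictor
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))        # true logistic model for the simulation
y = rng.binomial(1, p)                         # 0-1 responses

# fit ln(p/(1-p)) = b0 + b1*x by maximum likelihood (not least squares)
X1 = sm.add_constant(x)
fit = sm.Logit(y, X1).fit(disp=0)
print(fit.params)                              # roughly (-0.5, 1.2)
print(fit.conf_int())                          # confidence limits for the betas
```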


More information

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph. Regression, Part I I. Difference from correlation. II. Basic idea: A) Correlation describes the relationship between two variables, where neither is independent or a predictor. - In correlation, it would

More information

Section 1.3 Functions and Their Graphs 19

Section 1.3 Functions and Their Graphs 19 23. 0 1 2 24. 0 1 2 y 0 1 0 y 1 0 0 Section 1.3 Functions and Their Graphs 19 3, Ÿ 1, 0 25. y œ 26. y œ œ 2, 1 œ, 0 Ÿ " 27. (a) Line through a!ß! band a"ß " b: y œ Line through a"ß " band aß! b: y œ 2,

More information

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010 1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

More information

In the previous chapter, we learned how to use the method of least-squares

In the previous chapter, we learned how to use the method of least-squares 03-Kahane-45364.qxd 11/9/2007 4:40 PM Page 37 3 Model Performance and Evaluation In the previous chapter, we learned how to use the method of least-squares to find a line that best fits a scatter of points.

More information

OPTIMIZATION OF FIRST ORDER MODELS

OPTIMIZATION OF FIRST ORDER MODELS Chapter 2 OPTIMIZATION OF FIRST ORDER MODELS One should not multiply explanations and causes unless it is strictly necessary William of Bakersville in Umberto Eco s In the Name of the Rose 1 In Response

More information

Unit 27 One-Way Analysis of Variance

Unit 27 One-Way Analysis of Variance Unit 27 One-Way Analysis of Variance Objectives: To perform the hypothesis test in a one-way analysis of variance for comparing more than two population means Recall that a two sample t test is applied

More information

Probability and Statistics

Probability and Statistics Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

More information

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math. Regression, part II I. What does it all mean? A) Notice that so far all we ve done is math. 1) One can calculate the Least Squares Regression Line for anything, regardless of any assumptions. 2) But, if

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

Using SPSS for One Way Analysis of Variance

Using SPSS for One Way Analysis of Variance Using SPSS for One Way Analysis of Variance This tutorial will show you how to use SPSS version 12 to perform a one-way, between- subjects analysis of variance and related post-hoc tests. This tutorial

More information

Topic 1. Definitions

Topic 1. Definitions S Topic. Definitions. Scalar A scalar is a number. 2. Vector A vector is a column of numbers. 3. Linear combination A scalar times a vector plus a scalar times a vector, plus a scalar times a vector...

More information

MLR Model Selection. Author: Nicholas G Reich, Jeff Goldsmith. This material is part of the statsteachr project

MLR Model Selection. Author: Nicholas G Reich, Jeff Goldsmith. This material is part of the statsteachr project MLR Model Selection Author: Nicholas G Reich, Jeff Goldsmith This material is part of the statsteachr project Made available under the Creative Commons Attribution-ShareAlike 3.0 Unported License: http://creativecommons.org/licenses/by-sa/3.0/deed.en

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 10 Correlation and Regression 10-1 Overview 10-2 Correlation 10-3 Regression 10-4

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Linear Model Selection and Regularization

Linear Model Selection and Regularization Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In

More information

Stat 401B Final Exam Fall 2015

Stat 401B Final Exam Fall 2015 Stat 401B Final Exam Fall 015 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning

More information

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

Chapter 9 Regression. 9.1 Simple linear regression Linear models Least squares Predictions and residuals.

Chapter 9 Regression. 9.1 Simple linear regression Linear models Least squares Predictions and residuals. 9.1 Simple linear regression 9.1.1 Linear models Response and eplanatory variables Chapter 9 Regression With bivariate data, it is often useful to predict the value of one variable (the response variable,

More information

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

More information

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10

Physics 509: Error Propagation, and the Meaning of Error Bars. Scott Oser Lecture #10 Physics 509: Error Propagation, and the Meaning of Error Bars Scott Oser Lecture #10 1 What is an error bar? Someone hands you a plot like this. What do the error bars indicate? Answer: you can never be

More information

ST430 Exam 2 Solutions

ST430 Exam 2 Solutions ST430 Exam 2 Solutions Date: November 9, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textbook are permitted but you may use a calculator. Giving

More information

The scatterplot is the basic tool for graphically displaying bivariate quantitative data.

The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Bivariate Data: Graphical Display The scatterplot is the basic tool for graphically displaying bivariate quantitative data. Example: Some investors think that the performance of the stock market in January

More information

Regression Analysis: Basic Concepts

Regression Analysis: Basic Concepts The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance

More information

Introduction to Statistical modeling: handout for Math 489/583

Introduction to Statistical modeling: handout for Math 489/583 Introduction to Statistical modeling: handout for Math 489/583 Statistical modeling occurs when we are trying to model some data using statistical tools. From the start, we recognize that no model is perfect

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Exam 2 Practice Problems

Exam 2 Practice Problems ST 312 Exam 2 Practice Problems MATERIAL COVERED ON EXAM 2 Reiland Lecture Handouts Corresponding textbook chapters 6: Infererence for the Difference Between Means Chapters 22, 23 7: Scatterplots, Correlation,

More information

Model Selection. Frank Wood. December 10, 2009

Model Selection. Frank Wood. December 10, 2009 Model Selection Frank Wood December 10, 2009 Standard Linear Regression Recipe Identify the explanatory variables Decide the functional forms in which the explanatory variables can enter the model Decide

More information

DR.RUPNATHJI( DR.RUPAK NATH )

DR.RUPNATHJI( DR.RUPAK NATH ) Contents 1 Sets 1 2 The Real Numbers 9 3 Sequences 29 4 Series 59 5 Functions 81 6 Power Series 105 7 The elementary functions 111 Chapter 1 Sets It is very convenient to introduce some notation and terminology

More information

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc.

where Female = 0 for males, = 1 for females Age is measured in years (22, 23, ) GPA is measured in units on a four-point scale (0, 1.22, 3.45, etc. Notes on regression analysis 1. Basics in regression analysis key concepts (actual implementation is more complicated) A. Collect data B. Plot data on graph, draw a line through the middle of the scatter

More information

Math 1AA3/1ZB3 Sample Test 3, Version #1

Math 1AA3/1ZB3 Sample Test 3, Version #1 Math 1AA3/1ZB3 Sample Test 3, Version 1 Name: (Last Name) (First Name) Student Number: Tutorial Number: This test consists of 16 multiple choice questions worth 1 mark each (no part marks), and 1 question

More information

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation? Did You Mean Association Or Correlation? AP Statistics Chapter 8 Be careful not to use the word correlation when you really mean association. Often times people will incorrectly use the word correlation

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 39 Regression Analysis Hello and welcome to the course on Biostatistics

More information

Regression Model Specification in R/Splus and Model Diagnostics. Daniel B. Carr

Regression Model Specification in R/Splus and Model Diagnostics. Daniel B. Carr Regression Model Specification in R/Splus and Model Diagnostics By Daniel B. Carr Note 1: See 10 for a summary of diagnostics 2: Books have been written on model diagnostics. These discuss diagnostics

More information

Box-Cox Transformations

Box-Cox Transformations Box-Cox Transformations Revised: 10/10/2017 Summary... 1 Data Input... 3 Analysis Summary... 3 Analysis Options... 5 Plot of Fitted Model... 6 MSE Comparison Plot... 8 MSE Comparison Table... 9 Skewness

More information

Lecture 18 Miscellaneous Topics in Multiple Regression

Lecture 18 Miscellaneous Topics in Multiple Regression Lecture 18 Miscellaneous Topics in Multiple Regression STAT 512 Spring 2011 Background Reading KNNL: 8.1-8.5,10.1, 11, 12 18-1 Topic Overview Polynomial Models (8.1) Interaction Models (8.2) Qualitative

More information

Exam Applied Statistical Regression. Good Luck!

Exam Applied Statistical Regression. Good Luck! Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.

More information

Linear Regression Models

Linear Regression Models Linear Regression Models Model Description and Model Parameters Modelling is a central theme in these notes. The idea is to develop and continuously improve a library of predictive models for hazards,

More information

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters Objectives 10.1 Simple linear regression Statistical model for linear regression Estimating the regression parameters Confidence interval for regression parameters Significance test for the slope Confidence

More information

Mathematics for Economics MA course

Mathematics for Economics MA course Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Psychology Seminar Psych 406 Dr. Jeffrey Leitzel

Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Psychology Seminar Psych 406 Dr. Jeffrey Leitzel Structural Equation Modeling Topic 1: Correlation / Linear Regression Outline/Overview Correlations (r, pr, sr) Linear regression Multiple regression interpreting

More information

4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES

4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES 4:3 LEC - PLANNED COMPARISONS AND REGRESSION ANALYSES FOR SINGLE FACTOR BETWEEN-S DESIGNS Planned or A Priori Comparisons We previously showed various ways to test all possible pairwise comparisons for

More information

A Scientific Model for Free Fall.

A Scientific Model for Free Fall. A Scientific Model for Free Fall. I. Overview. This lab explores the framework of the scientific method. The phenomenon studied is the free fall of an object released from rest at a height H from the ground.

More information

Applied Regression Analysis. Section 2: Multiple Linear Regression

Applied Regression Analysis. Section 2: Multiple Linear Regression Applied Regression Analysis Section 2: Multiple Linear Regression 1 The Multiple Regression Model Many problems involve more than one independent variable or factor which affects the dependent or response

More information

Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues

Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues Overfitting Categorical Variables Interaction Terms Non-linear Terms Linear Logarithmic y = a +

More information

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method

More information

Prediction of Bike Rental using Model Reuse Strategy

Prediction of Bike Rental using Model Reuse Strategy Prediction of Bike Rental using Model Reuse Strategy Arun Bala Subramaniyan and Rong Pan School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, USA. {bsarun, rong.pan}@asu.edu

More information

Definition Suppose M is a collection (set) of sets. M is called inductive if

Definition Suppose M is a collection (set) of sets. M is called inductive if Definition Suppose M is a collection (set) of sets. M is called inductive if a) g M, and b) if B Mß then B MÞ Then we ask: are there any inductive sets? Informally, it certainly looks like there are. For

More information

Lecture 10: F -Tests, ANOVA and R 2

Lecture 10: F -Tests, ANOVA and R 2 Lecture 10: F -Tests, ANOVA and R 2 1 ANOVA We saw that we could test the null hypothesis that β 1 0 using the statistic ( β 1 0)/ŝe. (Although I also mentioned that confidence intervals are generally

More information

INTRODUCTION TO ANALYSIS OF VARIANCE

INTRODUCTION TO ANALYSIS OF VARIANCE CHAPTER 22 INTRODUCTION TO ANALYSIS OF VARIANCE Chapter 18 on inferences about population means illustrated two hypothesis testing situations: for one population mean and for the difference between two

More information

SIMULATION - PROBLEM SET 1

SIMULATION - PROBLEM SET 1 SIMULATION - PROBLEM SET 1 " if! Ÿ B Ÿ 1. The random variable X has probability density function 0ÐBÑ œ " $ if Ÿ B Ÿ.! otherwise Using the inverse transform method of simulation, find the random observation

More information

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc. Chapter 8 Linear Regression Copyright 2010 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu: Copyright

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

Chapter 13. Multiple Regression and Model Building

Chapter 13. Multiple Regression and Model Building Chapter 13 Multiple Regression and Model Building Multiple Regression Models The General Multiple Regression Model y x x x 0 1 1 2 2... k k y is the dependent variable x, x,..., x 1 2 k the model are the

More information

Math 3C Midterm 1 Study Guide

Math 3C Midterm 1 Study Guide Math 3C Midterm 1 Study Guide October 23, 2014 Acknowledgement I want to say thanks to Mark Kempton for letting me update this study guide for my class. General Information: The test will be held Thursday,

More information

Chapter 8. Linear Regression. The Linear Model. Fat Versus Protein: An Example. The Linear Model (cont.) Residuals

Chapter 8. Linear Regression. The Linear Model. Fat Versus Protein: An Example. The Linear Model (cont.) Residuals Chapter 8 Linear Regression Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 8-1 Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Fat Versus

More information

Algebra Year 10. Language

Algebra Year 10. Language Algebra Year 10 Introduction In Algebra we do Maths with numbers, but some of those numbers are not known. They are represented with letters, and called unknowns, variables or, most formally, literals.

More information

Nonlinear Regression. Summary. Sample StatFolio: nonlinear reg.sgp

Nonlinear Regression. Summary. Sample StatFolio: nonlinear reg.sgp Nonlinear Regression Summary... 1 Analysis Summary... 4 Plot of Fitted Model... 6 Response Surface Plots... 7 Analysis Options... 10 Reports... 11 Correlation Matrix... 12 Observed versus Predicted...

More information

1. Classify each number. Choose all correct answers. b. È # : (i) natural number (ii) integer (iii) rational number (iv) real number

1. Classify each number. Choose all correct answers. b. È # : (i) natural number (ii) integer (iii) rational number (iv) real number Review for Placement Test To ypass Math 1301 College Algebra Department of Computer and Mathematical Sciences University of Houston-Downtown Revised: Fall 2009 PLEASE READ THE FOLLOWING CAREFULLY: 1. The

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS 1a) The model is cw i = β 0 + β 1 el i + ɛ i, where cw i is the weight of the ith chick, el i the length of the egg from which it hatched, and ɛ i

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Chapter 19: Logistic regression

Chapter 19: Logistic regression Chapter 19: Logistic regression Self-test answers SELF-TEST Rerun this analysis using a stepwise method (Forward: LR) entry method of analysis. The main analysis To open the main Logistic Regression dialog

More information

Tutorial 6: Linear Regression

Tutorial 6: Linear Regression Tutorial 6: Linear Regression Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction to Simple Linear Regression................ 1 2 Parameter Estimation and Model

More information

Multiple Linear Regression

Multiple Linear Regression Andrew Lonardelli December 20, 2013 Multiple Linear Regression 1 Table Of Contents Introduction: p.3 Multiple Linear Regression Model: p.3 Least Squares Estimation of the Parameters: p.4-5 The matrix approach

More information

MATH 1301 (College Algebra) - Final Exam Review

MATH 1301 (College Algebra) - Final Exam Review MATH 1301 (College Algebra) - Final Exam Review This review is comprehensive but should not be the only material used to study for the final exam. It should not be considered a preview of the final exam.

More information

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore

Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Pattern Recognition Prof. P. S. Sastry Department of Electronics and Communication Engineering Indian Institute of Science, Bangalore Lecture - 27 Multilayer Feedforward Neural networks with Sigmoidal

More information

1. (Problem 3.4 in OLRT)

1. (Problem 3.4 in OLRT) STAT:5201 Homework 5 Solutions 1. (Problem 3.4 in OLRT) The relationship of the untransformed data is shown below. There does appear to be a decrease in adenine with increased caffeine intake. This is

More information

Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means

Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means 4.1 The Need for Analytical Comparisons...the between-groups sum of squares averages the differences

More information

Regression M&M 2.3 and 10. Uses Curve fitting Summarization ('model') Description Prediction Explanation Adjustment for 'confounding' variables

Regression M&M 2.3 and 10. Uses Curve fitting Summarization ('model') Description Prediction Explanation Adjustment for 'confounding' variables Uses Curve fitting Summarization ('model') Description Prediction Explanation Adjustment for 'confounding' variables MALES FEMALES Age. Tot. %-ile; weight,g Tot. %-ile; weight,g wk N. 0th 50th 90th No.

More information

2.2 Graphs of Functions

2.2 Graphs of Functions 2.2 Graphs of Functions Introduction DEFINITION domain of f, D(f) Associated with every function is a set called the domain of the function. This set influences what the graph of the function looks like.

More information

Inference for Regression Inference about the Regression Model and Using the Regression Line

Inference for Regression Inference about the Regression Model and Using the Regression Line Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about

More information

Ð"Ñ + Ð"Ñ, Ð"Ñ +, +, + +, +,,

ÐÑ + ÐÑ, ÐÑ +, +, + +, +,, Handout #11 Confounding: Complete factorial experiments in incomplete blocks Blocking is one of the important principles in experimental design. In this handout we address the issue of designing complete

More information