General Linear Models
- Ashlee Taylor
- 5 years ago
Revised: 10/10/2017

Contents
Summary
Analysis Summary
Analysis Options
Model Coefficients
Scatterplot
Table of Means
Means Plot
Interaction Plot
Multiple Range Tests
Surface and Contour Plots
Reports
Observed versus Predicted
Residual Plots
Unusual Residuals
Influential Points
MANOVA
Save Results
Calculations

2017 by Statgraphics Technologies, Inc.
Summary

The General Linear Models procedure is designed to construct a statistical model describing the impact of one or more factors X on one or more dependent variables Y. The factors may be:

1. quantitative or categorical
2. crossed or nested
3. fixed or random

Errors are assumed to follow a normal distribution. Weights may be supplied if a weighted least squares solution is desired. The output includes a wide variety of tables and graphs, including response surface plots, residual plots, and a MANOVA if more than one dependent variable is entered.

Many different types of experimental studies may be analyzed using this procedure. It includes as special cases models that can be estimated by the Multiple Regression, Oneway ANOVA, Multifactor ANOVA, and Variance Components procedures. In addition, it can analyze mixed models that cannot be handled by any of the above procedures.

Sample StatFolio: glm.sgp
Sample Data

The sample data that will be analyzed is a repeated measures study from Milliken and Johnson (1996). In this study, 2 experimental drugs and a control were each administered to 8 subjects (for a total of 24 subjects). The heart rates of the subjects were measured at 4 different times after the drugs were administered. The data are contained in the file heartrate.sgd, a portion of which is shown below:

[Table: columns Subject, Drug, Time, Heart Rate; the excerpt shows four rows each for subjects given AX23, BWW9, CONTROL, and AX23; numeric values not reproduced]

Since each of the subjects was given a different drug, Subject is said to be nested within Drug. It is a repeated measures experiment since measurements were taken for each subject-drug combination at multiple times.
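The distinction between nested and crossed factors can be checked directly from the rows of a datasheet. The following is a minimal sketch in Python, assuming a small made-up excerpt laid out like heartrate.sgd; the heart rate values and the helper names `is_nested` and `is_crossed` are hypothetical and are not part of Statgraphics.

```python
from collections import defaultdict

# Hypothetical excerpt in the layout (subject, drug, time, heart_rate).
# The heart rate values are invented for illustration only.
rows = [
    (1, "AX23", "T1", 72), (1, "AX23", "T2", 86),
    (2, "BWW9", "T1", 85), (2, "BWW9", "T2", 86),
    (3, "CONTROL", "T1", 69), (3, "CONTROL", "T2", 73),
]


def is_nested(rows, inner=0, outer=1):
    """True if every level of the inner factor occurs with exactly one level of the outer factor."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[inner]].add(row[outer])
    return all(len(levels) == 1 for levels in seen.values())


def is_crossed(rows, a=1, b=2):
    """True if every combination of the levels of factors a and b occurs at least once."""
    levels_a = {row[a] for row in rows}
    levels_b = {row[b] for row in rows}
    combos = {(row[a], row[b]) for row in rows}
    return len(combos) == len(levels_a) * len(levels_b)


print("Subject nested within Drug:", is_nested(rows))
print("Drug crossed with Time:", is_crossed(rows))
```

In this sketch each subject appears under a single drug, so Subject is nested within Drug, while every drug is observed at every time, so Drug and Time are crossed.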
Data Input

The first of two data input dialog boxes requests the names of the columns containing the dependent variables Y and the independent variables X:

Y: one or more numeric columns containing the n observations for the dependent variables Y. If more than one column is entered, separate models will be fit for each one. In addition, a MANOVA may be requested.

Categorical factors: numeric or non-numeric columns containing the n levels of any non-quantitative factors X.

Quantitative factors: numeric columns containing the n values of any quantitative factors X.

Weight: an optional numeric column containing n weights w to be applied to the squared residuals when performing a weighted least squares fit. In cases where the variance of Y is known to vary, the weights should be inversely proportional to those variances. If nothing is specified in this field, all w = 1.

Select: subset selection.

In the sample study, there is one response and three categorical factors.

The second dialog box is used to specify the model to be fit to the data:
Factors: Each of the categorical and quantitative factors is assigned a letter between A and Z.

Effects: The effects to be included in the model are specified using the letters assigned to the factors. Effects are entered as follows:

1. Main effects for crossed factors - Enter a single letter such as A.
2. Interactions between crossed factors - Enter a term such as A*C to include the interaction between factors A and C or A*B*C to specify a 3-factor interaction.
3. Effects of nested factors - Enter a term such as B(A) if factor B is nested within factor A or C(B A) if factor C is nested within combinations of factors A and B.
4. First order effects of quantitative factors - Enter a single letter such as A.
5. Second order effects of quantitative factors - Enter a term such as A*A for the quadratic effect of A or A*B for a cross-product.

Random Factors: Categorical factors may be either Fixed or Random. A factor is Random if its levels consist of a random sample of levels from a population of possible levels. A factor is Fixed if its levels are selected by a nonrandom process or if its levels consist of the entire population of possible levels.

The effects specified on the dialog box above are:

A: the main effects of Drug. Drug is a fixed factor, since the effects of the specific drugs tested are to be estimated.
B(A): the effects of Subject, nested within Drug. Subject is nested within Drug, since different subjects were given each drug. Subject is also a random factor, since the 24 subjects selected are a random sample from the population of interest, which consists of everyone who might take those drugs in the future.

C: the main effects of Time. Time is a fixed factor, since the effects at specific times are to be estimated.

A*C: the interactions between Drug and Time. This term will allow the Time effect to be different for the 3 levels of Drug.

Analysis Summary

The Analysis Summary shows information about the fitted model. The top section of the output is shown below:

General Linear Models
Number of dependent variables: 1
Number of categorical factors: 3
Number of quantitative factors: 0
Number of observations: 96

Analysis of Variance for Heart Rate
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
Model
Residual
Total (Corr.)

Type III Sums of Squares
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
Drug
Subject(Drug)
Time
Drug*Time
Residual
Total (corrected)

[numeric values not reproduced]

Included in the output are:

Analysis of Variance: a decomposition of the sum of squares for Y into components for the model and for the residuals. The F-test tests the statistical significance of the model as a whole. A small P-value (less than 0.05 if operating at the 5% significance level) indicates that at least one factor in the model is significantly related to the dependent variable. In the current example, the model is highly significant.

Type III Sums of Squares: decomposition of the model sum of squares into components for each factor. Based on the settings specified on the Analysis Options dialog box, either Type III or Type I sums of squares are displayed. Type III sums of squares test the marginal significance of each factor, assuming it was the last to be entered into the model. Type I sums of squares test the significance of the effects in the order they were added to the model. Small P-values indicate significant effects.
In this example, all four effects are highly significant.
The second section of the analysis is important if the experiment contains any random factors.

Expected Mean Squares
Source           EMS
Drug             (5)+4.0(2)+Q1
Subject(Drug)    (5)+4.0(2)
Time             (5)+Q2
Drug*Time        (5)+Q3
Residual         (5)

F-Test Denominators
Source           Df   Mean Square   Denominator
Drug                                (2)
Subject(Drug)                       (5)
Time                                (5)
Drug*Time                           (5)

Variance Components
Source           Estimate
Subject(Drug)
Residual

[numeric values not reproduced]

It includes:

Expected Mean Squares: The expected mean square for each factor is determined using Hartley's (1967) synthesis method. The mean squares in the earlier Sums of Squares table are labeled from top to bottom as (1) for Drug, (2) for Subject within Drug, and so on through (5) for the Residuals. A term such as Q1 indicates a quantity unique to the factor in which it appears. The expected mean squares are important in constructing proper F-tests for models involving random factors.

F-Test Denominators: the mean square used as the denominator of the F-test for each factor, together with its degrees of freedom and how it was determined. For example, the F-test for Drug uses mean square (2) in its denominator, which equates to using Subject(Drug) as the error term.

Variance Components: for models with random factors, estimates the variance component $\sigma_j^2$ of each random effect. The components are derived by equating the mean squares with their expected values, which is referred to as the method of moments. Variance components measure the variability in the response induced by variation in the random factors. For example, the table gives an estimate of the variance in heart rate among persons given the same drug at the same time.

The final section of the table shows statistics calculated from the fitted model.
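The F-test denominators and the method of moments described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the mean squares below are hypothetical placeholders rather than output from the heart rate example, the function names are invented, and the coefficient 4.0 is taken from the expected mean square (5)+4.0(2) shown for Subject(Drug).

```python
def f_ratio(ms_effect, ms_denominator):
    """F-ratio of an effect against the mean square chosen as its error term."""
    return ms_effect / ms_denominator


def subject_variance_component(ms_subject, ms_residual, coefficient=4.0):
    """Method of moments: solve E[MS_subject] = var_residual + coefficient * var_subject.

    The result can be negative when ms_subject < ms_residual; no truncation is applied here.
    """
    return (ms_subject - ms_residual) / coefficient


# Hypothetical mean squares, for illustration only.
ms_drug = 300.0      # mean square (1)
ms_subject = 100.0   # mean square (2), Subject(Drug)
ms_residual = 20.0   # mean square (5)

# Drug is tested against Subject(Drug); Subject(Drug) is tested against the residual.
print("F for Drug:", f_ratio(ms_drug, ms_subject))
print("F for Subject(Drug):", f_ratio(ms_subject, ms_residual))
print("Variance component for Subject(Drug):", subject_variance_component(ms_subject, ms_residual))
print("Variance component for Residual:", ms_residual)
```

Testing Drug against Subject(Drug) rather than against the residual follows from the expected mean squares: under the null hypothesis Q1 = 0, the Drug and Subject(Drug) mean squares have the same expectation.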
R-Squared
R-Squared (adjusted for d.f.)
Standard Error of Est.
Mean absolute error
Durbin-Watson statistic (P=0.1049)

Residual Analysis
         Estimation   Validation
N        96
MSE
MAE
MAPE
ME
MPE

[remaining numeric values not reproduced]

The output displays:

Statistics: summary statistics for the fitted model, including:

R-squared - represents the percentage of the variability in Y which has been explained by the fitted regression model, ranging from 0% to 100%. It is calculated by:

$$R^2 = 100\left(1 - \frac{SS_{error}}{SS_{total}}\right)\% \tag{1}$$

For the sample data, the regression has accounted for about 90.5% of the variability in the heart rates. The remaining 9.5% is attributable to deviations from the model, which may be due to other factors, to measurement error, or to a failure of the current model to fit the data adequately.

Adjusted R-Squared - the R-squared statistic, adjusted for the number of coefficients in the model:

$$R^2_{adj} = 100\left[1 - \left(\frac{n-1}{n-p}\right)\frac{SS_{error}}{SS_{total}}\right]\% \tag{2}$$

where p is the number of estimated model coefficients. This value is often used to compare models with different numbers of coefficients.

Standard Error of Est. - the estimated standard deviation of the residuals (the deviations around the model):

$$\hat{\sigma} = \sqrt{MSE} \tag{3}$$

This value is used to create prediction limits for new observations.

Mean Absolute Error - the average absolute value of the residuals:
$$MAE = \frac{\sum_{i=1}^{n} |e_i|}{n} \tag{4}$$

This value indicates the average error in predicting the response using the fitted model.

Durbin-Watson Statistic - a measure of serial correlation in the residuals:

$$DW = \frac{\sum_{i=1}^{n-1} (e_{i+1} - e_i)^2}{\sum_{i=1}^{n} e_i^2} \tag{5}$$

If the residuals vary randomly, this value should be close to 2. A small P-value indicates a non-random pattern in the residuals. For data recorded over time, a small P-value could indicate that some trend over time has not been accounted for. In the current example, the P-value is greater than 0.05, so there is not a significant correlation at the 5% significance level.

Residual Analysis: if a subset of the rows in the datasheet have been excluded from the analysis using the Select field on the data input dialog box, the fitted model is used to make predictions of the Y values for those rows. This table shows statistics on the prediction errors, defined by

$$e_i = y_i - \hat{y}_i \tag{6}$$

Included are the mean squared error:

$$MSE = \frac{\sum_{i=1}^{n} e_i^2}{n} \tag{7}$$

the mean absolute error:

$$MAE = \frac{\sum_{i=1}^{n} |e_i|}{n} \tag{8}$$

the mean absolute percentage error:

$$MAPE = \frac{100 \sum_{i=1}^{n} |e_i / y_i|}{n}\% \tag{9}$$

the mean error:
$$ME = \frac{\sum_{i=1}^{n} e_i}{n} \tag{10}$$

and the mean percentage error:

$$MPE = \frac{100 \sum_{i=1}^{n} e_i / y_i}{n}\% \tag{11}$$

The validation statistics can be compared to the statistics for the fitted model to determine how well that model predicts observations outside of the data used to fit it.

Analysis Options

Sums of Squares: the sums of squares to display. Type I sums of squares measure the contribution of each variable to the model when added in the order indicated. Type III sums of squares measure the marginal contribution of each effect, assuming it was added last.
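The summary and validation statistics in equations (1) through (11) can be computed from observed and fitted values as in the following Python sketch. It is an illustration rather than the Statgraphics implementation: the function names and the tiny data set are hypothetical, and the standard error of the estimate is assumed to use the residual mean square SS_error / (n - p).

```python
import math


def fit_statistics(y, fitted, p):
    """Statistics (1) to (5) for a model with p estimated coefficients."""
    n = len(y)
    e = [yi - fi for yi, fi in zip(y, fitted)]
    y_bar = sum(y) / n
    ss_error = sum(ei ** 2 for ei in e)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "r_squared": 100.0 * (1.0 - ss_error / ss_total),
        "r_squared_adj": 100.0 * (1.0 - (n - 1) / (n - p) * ss_error / ss_total),
        # Assumption: MSE in equation (3) is the residual mean square.
        "std_error_est": math.sqrt(ss_error / (n - p)),
        "mae": sum(abs(ei) for ei in e) / n,
        "durbin_watson": sum((e[i + 1] - e[i]) ** 2 for i in range(n - 1)) / ss_error,
    }


def validation_statistics(y, predicted):
    """Statistics (7) to (11) on prediction errors e_i = y_i - yhat_i; requires nonzero y."""
    n = len(y)
    e = [yi - pi for yi, pi in zip(y, predicted)]
    return {
        "mse": sum(ei ** 2 for ei in e) / n,
        "mae": sum(abs(ei) for ei in e) / n,
        "mape": 100.0 * sum(abs(ei / yi) for ei, yi in zip(e, y)) / n,
        "me": sum(e) / n,
        "mpe": 100.0 * sum(ei / yi for ei, yi in zip(e, y)) / n,
    }


# Hypothetical observed and fitted values.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_fit = [1.5, 1.5, 3.5, 3.5]
print(fit_statistics(y_obs, y_fit, p=2))
print(validation_statistics(y_obs, y_fit))
```

Note that the validation MSE in equation (7) divides by n, while the residual mean square used for the standard error of the estimate divides by n - p.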
Display: if more than one dependent variable has been specified, the variable to use when creating plots and tables that display only a single variable.

Constant in model: If this option is not checked, the constant term $\beta_0$ will be omitted from the model. Removing the constant term allows for regression through the origin.

Include MANOVA: If more than one dependent variable has been specified, checking this box will cause a multivariate analysis of variance to be included in the Analysis Summary. For more information, see the example later in this document.

Box-Cox Transformation: If selected, a Box-Cox transformation will be applied to the dependent variable(s). Box-Cox transformations are a way of dealing with situations in which the deviations from the regression model do not have a constant variance. You may specify the Box-Cox parameters or request that the program automatically find the optimal power. For details, see the Box-Cox Transformations documentation.

Factor and Error Term: the denominator to be used for each factor when creating an F-test. The Automatic option causes the program to select the proper denominator automatically. You can override the program's selections by clicking on a factor and then clicking on the desired error term. The current error terms are displayed in the Selections field.

Model Coefficients

Underlying the analysis is a linear statistical model of the form

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_{p-1} X_{p-1} + \varepsilon \tag{12}$$

where Y is the dependent variable, the X's carry information about each of the effects in the model, and the $\varepsilon$'s are assumed to be normally and independently distributed with a mean of 0. The Model Coefficients pane displays the estimated coefficients, standard errors, lower and upper confidence limits, and variance inflation factors:
95.0% confidence intervals for coefficient estimates (Heart Rate)
Parameter        Estimate   Standard Error   Lower Limit   Upper Limit   V.I.F.
CONSTANT
Drug             (2 rows)
Subject(Drug)    (21 rows)
Time             (3 rows)
Drug*Time        (6 rows)

[numeric values not reproduced]

The model can get quite complicated, particularly when categorical factors are involved. It includes a term for each degree of freedom associated with the effects. Except in simple cases, it is not expected that the user will calculate values using the model, since the Reports pane will create predictions for any combination of the factors.

Parameter: the estimated model coefficients. The columns of X are defined as follows:

1. Constant: X contains a column of 1's.
2. Main effect of a quantitative factor: X contains the value of the independent variable.
3. Main effects of a categorical factor: For a factor with k levels, X contains k-1 indicator variables. The first variable equals 1 when the factor is at its first level, -1 when the factor is at its last level, and 0 otherwise. The second variable equals 1 when the factor is at its second level, -1 when the factor is at its last level, and 0 otherwise. Etc.
4. Interactions between factors: X contains the product of the columns created for those factors.

For example, the equation for the first subject that was given the first drug at the first time in the above table is:

$$\text{Heart Rate} = b_0 + b_{Drug_1}(1) + b_{Subject(Drug)}(1) + b_{Time_1}(1) + b_{Drug_1 \times Time_1}(1)$$

The equation for the first subject that was given the last drug at the first time is:

$$\text{Heart Rate} = b_0 + b_{Drug_1}(-1) + b_{Drug_2}(-1) + b_{Subject(Drug)}(1) + b_{Time_1}(1) + b_{Drug_1 \times Time_1}(-1) + b_{Drug_2 \times Time_1}(-1)$$

where each b denotes the corresponding estimated coefficient in the table above.

Standard errors: estimated standard errors for each of the model coefficients.

Confidence Limits: two-sided confidence limits or one-sided confidence bounds for the model coefficients.

V.I.F.: variance inflation factors. The variance inflation factors measure how large the variance of the coefficients is compared to what it would be if the independent variables were uncorrelated. Values greater than 10.0 usually indicate serious multicollinearity amongst the predictor variables, which leads to imprecise estimates of the model coefficients.

Pane Options

Type of Interval: Select either two-sided confidence limits or one-sided confidence bounds.

Confidence Level: percentage used for the limits or bounds.

Show Correlations: If selected, a table of estimated correlations between the model coefficients will be displayed. This table can be helpful in determining how well the effects of different independent variables have been separated from each other.
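The indicator coding described for the columns of X (1 for the named level, -1 for the last level, 0 otherwise, with interaction columns formed as products) can be sketched in Python as follows. The helper names are hypothetical and the sketch covers only crossed categorical factors.

```python
def effect_code(level_index, k):
    """Return the k-1 indicator values for one level (0-based) of a k-level factor.

    Indicator j is 1 at level j, -1 at the last level, and 0 otherwise.
    """
    if not 0 <= level_index < k:
        raise ValueError("level_index out of range")
    if level_index == k - 1:
        return [-1] * (k - 1)
    return [1 if j == level_index else 0 for j in range(k - 1)]


def interaction_columns(cols_a, cols_b):
    """Interaction indicators: all products of a column of A with a column of B."""
    return [a * b for a in cols_a for b in cols_b]


# Drug has 3 levels (AX23, BWW9, CONTROL); Time has 4 levels (T1 to T4).
drug_first = effect_code(0, 3)   # AX23
drug_last = effect_code(2, 3)    # CONTROL
time_first = effect_code(0, 4)   # T1

print(drug_first, drug_last, time_first)
print(interaction_columns(drug_first, time_first))
print(interaction_columns(drug_last, time_first))
```

For the last drug at the first time, both Drug indicators and the two Drug*Time products involving the first Time indicator equal -1, which matches the pattern of (-1) terms in the second equation above.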
Scatterplot

The Scatterplot plots the observations versus any one of the selected factors.

[Figure: Scatterplot for Heart Rate; horizontal axis: Drug (AX23, BWW9, CONTROL); vertical axis: Heart Rate]

It is often helpful to jitter the points in the horizontal direction by pressing the Jitter button on the analysis toolbar, as in the above plot. Jittering offsets each point by a random amount to prevent the points from falling exactly on top of each other.

Pane Options

Plot versus: the factor to plot on the horizontal axis.
Table of Means

This table displays the least squares means for each level of the factors and for pairs of levels for any included two-factor interactions. Least squares means represent the predicted mean value of Y at a specified level of a categorical factor X when all quantitative variables are set equal to their observed means and all indicator variables for other categorical factors are set equal to 0. Each mean is shown together with its estimated standard error and a confidence interval:

Table of Least Squares Means for Heart Rate with 95.0 Percent Confidence Intervals
Level   Count   Mean   Stnd. Error   Lower Limit   Upper Limit
GRAND MEAN
Drug: AX23, BWW9, CONTROL
Subject within Drug: subjects 1 to 8 for each of AX23, BWW9, CONTROL
Time: T1, T2, T3, T4
Drug by Time: each of AX23, BWW9, CONTROL at T1, T2, T3, T4

[numeric values not reproduced]
For example, the mean heart rate of subjects given drug AX23 at time T1 is estimated to be between 68.6 and 72.4, with 95% confidence.

Pane Options

Confidence Level: the level of confidence associated with each interval.

Means Plot

The level means for a selected factor may be plotted using the Means Plot.

[Figure: Means and 95.0 Percent Tukey HSD Intervals; horizontal axis: Drug (AX23, BWW9, CONTROL); vertical axis: Heart Rate]

If the factor plotted on the horizontal axis is categorical, then the plot shows the least squares means with uncertainty intervals. The type of interval displayed depends on the settings in Pane Options. If the factor on the horizontal axis is quantitative, the plot displays the fitted model with all other quantitative factors set equal to their observed means and all categorical indicator variables set equal to 0.

Provided all of the sample sizes are the same (or close), the analyst can determine which level means of a categorical factor are significantly different from which others using the LSD, Tukey, Scheffe, or Bonferroni procedure simply by looking at whether or not a pair of intervals overlap in the vertical direction. A pair of intervals that do not overlap indicates a statistically significant difference between the means at the selected confidence level. In this case, note that the interval for drug BWW9 does not overlap the interval for CONTROL, indicating a statistically significant difference between the means at those two levels. The intervals for AX23 and CONTROL overlap, however, so they cannot be declared to be significantly different.
Pane Options

Intervals: the method used to construct the intervals.

Factor: the factor to be plotted.

Confidence Level: the level of confidence associated with each interval.

The types of intervals that may be selected are:

Confidence intervals - displays confidence intervals for the level means using the estimated standard errors.

LSD intervals - designed to compare any pair of means with the stated confidence level.

Tukey HSD Intervals - designed for comparing all pairs of means. The stated confidence level applies to the entire family of pairwise comparisons.

Scheffe Intervals - designed for comparing all contrasts. Not usually relevant here.

Bonferroni Intervals - designed for comparing a selected number of contrasts. Tukey's intervals are usually tighter.

Each of the intervals is formed by adding and subtracting a multiple of the standard error of the least squares mean from the estimated mean. The multiple depends upon the method used, as described in the Oneway ANOVA documentation. The degrees of freedom are those associated with the estimate of the standard error and depend on the structure of the experiment.
Interaction Plot

When one or more significant interactions exist amongst the categorical factors, the factors involved should be examined together using the Interaction Plot.

[Figure: Interaction Plot; horizontal axis: Time (T1, T2, T3, T4); vertical axis: Heart Rate; one line per Drug (AX23, BWW9, CONTROL)]

The interaction plot displays the least squares means at all combinations of two factors. If the factors do not interact, the lines on the plot should be approximately parallel. If they are not, then the effect of one factor depends upon the level of the other, which is the definition of an interaction. Notice that the heart rate for the CONTROL group changes very little over time, while it shows significant changes for the other two drugs. In addition, drug BWW9 appears to have a quicker and more sustained effect than drug AX23.

Pane Options
Interval: type of interval to be drawn around each mean. The interaction is treated as a factor with number of levels equal to the total number of plotted points.

Interaction: interaction to plot.

Confidence Level: percentage used to define the intervals.

Plot on Axis: the factor used to define points along the horizontal axis. Lines will be drawn at each level of the other factor.

Multiple Range Tests

For factors that show significant P-values in the ANOVA table and which do not interact with other factors, a further analysis can be performed by selecting the Multiple Range Tests.

Multiple Comparisons for Heart Rate by Drug
Method: 95.0 percent LSD
Drug      Count   LS Mean   LS Sigma   Homogeneous Groups
CONTROL                                X
AX23                                   XX
BWW9                                    X

Contrast          Sig.   Difference   +/- Limits
AX23 - BWW9
AX23 - CONTROL
BWW9 - CONTROL    *
* denotes a statistically significant difference.

[numeric values not reproduced]

The top half of the table displays each of the estimated least squares means in increasing order of magnitude. It shows:
Count - the number of observations at the specified level of the factor.

LS Mean - the estimated least squares mean. In the case of a balanced design, the least squares mean is equivalent to the average of all observations at the indicated factor level. In unbalanced designs, the least squares mean is the predicted value of the dependent variable when the specified factor is set to a particular level while all other factors are set equal to their mean levels. The least squares means adjust for any imbalance in the data by making predictions at a common level of all the factors.

LS Sigma - the estimated standard error of the least squares mean.

Homogeneous groups - a graphical illustration of which means are significantly different from which others, based on the contrasts displayed in the second half of the table. Each column of X's indicates a group of means within which there are no statistically significant differences. In the example, there are 2 columns, each containing a pair of X's. It indicates that drug AX23 is not significantly different from either the CONTROL or from drug BWW9. However, since CONTROL and BWW9 are not within the same group anywhere, their means are significantly different.

The second half of the table displays a comparison between each pair of level means.

Difference - the difference between the two least squares means.

Limits - an interval estimate of that difference, using the currently selected multiple comparisons procedure.

Sig. - An asterisk is placed next to any difference that is statistically significantly different from 0 at the currently selected significance level, i.e., any interval that does not contain 0.

Pane Options
Type: type of contrasts to be created.

Factor: factor to be analyzed.

Method: the method used to make the multiple comparisons.

Control Level: if Type is set to Versus Control, the number of the level against which all other levels will be compared.

Confidence Level: the level of confidence used by the selected multiple comparison procedure.

The available methods are:

LSD - forms a confidence interval for each pair of means at the selected confidence level using Student's t distribution. This procedure is due to Fisher and is called the Least Significant Difference procedure, since the magnitude of the limits indicates the smallest difference between any two means that can be declared to represent a statistically significant difference. It should only be used when the F-test in the ANOVA table indicates significant differences amongst the level means. The probability α of making a Type I error applies to each pair of means separately. If making more than one comparison, the overall probability of calling at least one pair of means significantly different when they are not may be considerably larger than α.

Tukey HSD - widens the intervals to allow for multiple comparisons amongst all pairs of means using Tukey's T. Tukey called his procedure the Honestly Significant Difference procedure since it controls the experiment-wide error rate at α. If all of the means are equal, the probability of declaring any of the pairs to be significantly different in the entire experiment equals α. Tukey's procedure is more conservative than Fisher's LSD procedure, since it makes it harder to declare any particular pair of means to be significantly different.

Scheffe - designed to permit the estimation of all possible contrasts amongst the sample means (not just pairwise comparisons).

Bonferroni - designed to permit the estimation of any preselected number of contrasts. These limits are usually wider than Tukey's limits when all pairwise comparisons are being made.

Multivariate t - designed for sets of linearly independent combinations of the means.
Student-Newman-Keuls - Unlike the previous methods, this method does not create intervals for the pairwise differences. Instead, it sorts the means in increasing order and then begins to separate them into groups according to values of the Studentized range distribution. Eventually, the means are separated into homogeneous groups within which there are no significant differences.
Duncan - similar to the Student-Newman-Keuls procedure, except that it uses a different critical value of the Studentized range distribution when defining the homogeneous groups. A detailed discussion of the Duncan and Student-Newman-Keuls procedures is given by Milliken and Johnson (1992).

Dunnett - designed for pairwise comparisons when one level is a control.

Example - User-Specified Contrasts

User-specified contrasts may be tested by setting Type to User-Specified. When OK is pressed, a small datasheet will be displayed on which to define the contrasts. Each row of the datasheet specifies the coefficients in the contrast

$$c_1 \mu_1 + c_2 \mu_2 + \dots + c_k \mu_k \tag{13}$$

where the coefficients $c_j$ must sum to 0. For example, the datasheet below defines a contrast (14) which compares the average response of the two experimental drugs to the control.

The resulting output displays each least squares mean and an interval estimate for the contrasts:

Multiple Comparisons for Heart Rate by Drug
Method: 95.0 percent LSD
Drug      Count   LS Mean
AX23
BWW9
CONTROL

[numeric values not reproduced]
Contrast   Sig.   Estimate   +/- Limits
           *
* denotes a statistically significant estimate.

[numeric values not reproduced]

If LSD is selected, the +/- Limits correspond to 95% confidence intervals for the desired contrasts.

Surface and Contour Plots

If the model involves at least two quantitative factors, surface and contour plots can be created. For example, using the 93cars.sf6 dataset, the following plot displays a model for MPG Highway as a function of the Length and Width of the automobiles in that file.

[Figure: Estimated Response Surface; axes: Length, Width, MPG Highway]

The fitted model includes the main effects of both factors together with their interaction. Lines have been dropped from each point perpendicularly to the estimated model.

Pane Options
Type: type of plot to display. The fitted model may be plotted as a 3-D Surface plot, a 2-D Contour plot, at each corner of a square, or at each corner of a cube (given at least 3 quantitative factors).

Contours From, To, and By: defines the contour regions when contours are added to the plot. The contours may be drawn as solid Lines, Painted Regions of solid color, using a Continuous range of colors, or as Continuous with grid.

Resolution: the number of X and Y locations at which the function is evaluated when creating the plot. A larger resolution results in a smoother plot. You can set the default resolution using the Preferences selection on the Edit menu.

Surface Horizontal and Vertical Divisions: the number of intervals between the grid lines along the X and Y axes.

Contours Below: draws contours in the base of the cube when creating a surface plot.

Draw Points: plots each observation and drops a vertical line to the surface.

Type: the type of surface to be drawn:

o Wire frame: a surface defined by grid lines only.
o Solid: a surface defined by grid lines with a solid color between the lines.
o Contoured: a surface with colored regions showing the value of the function.
o 3-D contours: a cube defined by 3 factors with contours on 3 faces.
o 3-D mesh plot: a cube in which the response is evaluated on a mesh throughout the cube.

Factors: Press this button to set the limits for the factors on the plot and the values at which to fix other factors. The following dialog box will be displayed:

Low and High: plotting limits for the selected factors.

Hold: values to fix other factors at when evaluating the fitted model.
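The roles of the Resolution setting and the Low and High limits can be illustrated with a short Python sketch that evaluates a fitted model with two main effects and their interaction on a grid and at the four corners of a square. The coefficients and limits are hypothetical placeholders, not estimates from the 93cars.sf6 data, and the function names are invented for this illustration.

```python
def predict(coeffs, length, width):
    """Fitted model with two main effects and their interaction."""
    b0, b_length, b_width, b_inter = coeffs
    return b0 + b_length * length + b_width * width + b_inter * length * width


def linspace(low, high, count):
    """Evenly spaced values from low to high inclusive (count >= 2)."""
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def surface_grid(coeffs, length_limits, width_limits, resolution=5):
    """Evaluate the model at resolution x resolution locations between the Low and High limits."""
    lengths = linspace(length_limits[0], length_limits[1], resolution)
    widths = linspace(width_limits[0], width_limits[1], resolution)
    return [[predict(coeffs, L, W) for L in lengths] for W in widths]


def square_plot(coeffs, length_limits, width_limits):
    """Predicted values at the four corners of the square defined by the limits."""
    return {(L, W): predict(coeffs, L, W) for L in length_limits for W in width_limits}


# Hypothetical coefficients and plotting limits.
coeffs = (1.0, 2.0, 3.0, 0.5)
grid = surface_grid(coeffs, (0.0, 4.0), (0.0, 8.0), resolution=5)
corners = square_plot(coeffs, (0.0, 4.0), (0.0, 8.0))
print(len(grid), "rows by", len(grid[0]), "columns")
print(corners)
```

A larger resolution only adds evaluation points between the same limits, which is why it produces a smoother plot without changing the fitted surface.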
Example: Surface Plot with Continuous Contours Below

[Figure: Estimated Response Surface; axes: Length, Width, MPG Highway]

Example: Square Plot

[Figure: Square Plot for MPG Highway; axes: Length, Width]

The values displayed at each corner of the square are the predicted values Ŷ.

Reports

The Reports pane displays predictions from the fitted least squares model. By default, the table includes a line for each row in the datasheet that has complete information on the X variables and a missing value for the Y variable. This allows you to add rows to the bottom of the datasheet corresponding to levels at which you want predictions without affecting the fitted model.

For example, suppose you wished to display the estimated values for each of the two experimental drugs at the four time periods. Additional rows would be added to the bottom of the datasheet as follows:
Row   Subject   Drug   Time   Heart Rate
97    0         AX23   T1
98    0         AX23   T2
99    0         AX23   T3
100   0         AX23   T4
101   0         BWW9   T1
102   0         BWW9   T2
103   0         BWW9   T3
104   0         BWW9   T4

Subject is set to 0 so that all of the indicator variables for that factor will be set to 0, effectively averaging across all subjects. The resulting table is shown below:

Regression Results for Heart Rate
Row   Fitted Value   Stnd. Error for Forecast   Lower 95.0% CL for Forecast   Upper 95.0% CL for Forecast   Lower 95.0% CL for Mean   Upper 95.0% CL for Mean

[numeric values not reproduced]

The table displays:

Row - the row number in the datasheet.

Fitted Value - the predicted value of the dependent variable Ŷ using the fitted model.

Standard Error for Forecast - the estimated standard error for predicting a single new observation.

Confidence Limits for Forecast - prediction limits for new observations at the selected level of confidence.

Confidence Limits for Mean - confidence limits for the mean value of Y at the selected level of confidence.

For example, an additional subject given drug BWW9 is likely to have a heart rate at time T1 between 76.0 and 87.5 (row #101). The 95% confidence interval for the mean heart rate of many subjects given that drug at that time has a lower limit of 79.8.
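The difference between limits for a forecast and limits for a mean can be sketched in Python. The sketch assumes the textbook ordinary least squares relationship with unit weights, in which the squared standard error of a forecast is the residual mean square plus the squared standard error of the fitted mean; whether the procedure uses exactly this form for every model is not stated in the text. The inputs are hypothetical, and the critical value of Student's t is supplied by the caller.

```python
import math


def forecast_and_mean_limits(fitted, se_mean, mse, t_crit):
    """Two-sided limits for a single new observation and for the mean response.

    Assumption: se_forecast**2 = mse + se_mean**2 (ordinary least squares, unit weights).
    """
    se_forecast = math.sqrt(mse + se_mean ** 2)
    return {
        "se_forecast": se_forecast,
        "forecast": (fitted - t_crit * se_forecast, fitted + t_crit * se_forecast),
        "mean": (fitted - t_crit * se_mean, fitted + t_crit * se_mean),
    }


# Hypothetical inputs, for illustration only.
limits = forecast_and_mean_limits(fitted=80.0, se_mean=3.0, mse=16.0, t_crit=2.0)
print(limits)
```

Because the forecast error includes the variability of a single new observation as well as the uncertainty in the fitted mean, the forecast limits are always at least as wide as the limits for the mean.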
Pane Options

You may include:

Observed Y - the observed values of the dependent variable.

Fitted Y - the predicted values from the fitted model.

Residuals - the ordinary residuals (observed minus predicted).

Studentized Residuals - the Studentized deleted residuals as described below.

Standard Errors for Forecasts - the standard errors for new observations at values of the independent variables corresponding to each row of the datasheet.

Confidence Limits for Individual Forecasts - confidence intervals for new observations.

Confidence Limits for Forecast Means - confidence intervals for the mean value of Y at values of the independent variables corresponding to each row of the datasheet.

Observed versus Predicted

The Observed versus Predicted plot shows the observed values of Y on the vertical axis and the predicted values Ŷ on the horizontal axis.

[Figure: Plot of Heart Rate; horizontal axis: predicted; vertical axis: observed]
If the model fits well, the points should be randomly scattered around the diagonal line. Any change in variability from low values of Y to high values of Y might indicate the need to transform the dependent variable before fitting a model to the data.

Residual Plots

As with all statistical models, it is good practice to examine the residuals. In a regression, the residuals are defined by

$$e_i = y_i - \hat{y}_i \qquad (15)$$

i.e., the residuals are the differences between the observed data values and the fitted model.

The General Linear Models procedure creates various types of residual plots, depending on the settings in Pane Options.

Scatterplot versus Predicted Values

This plot is helpful in visualizing any possible dependence of the residual variance on the mean, which might necessitate a weighted least squares fit.

[Figure: Residual Plot - Studentized residual versus predicted Heart Rate]

The above plot shows a fairly constant variance, although one possible outlier is evident.

Normal Probability Plot

This plot can be used to determine whether or not the deviations around the line follow a normal distribution, which is the assumption used to form the prediction intervals.
[Figure: Normal Probability Plot for Heart Rate - percentage versus residual]

If the deviations follow a normal distribution, they should fall approximately along a straight line. In the above plot, the points fall fairly close to the line.

Residual Autocorrelations

This plot shows the autocorrelation between residuals as a function of the number of rows between them in the datasheet.

[Figure: Residual Autocorrelations for Heart Rate - autocorrelation versus lag]

It is only relevant if the data have been collected sequentially. Any bars extending beyond the probability limits would indicate significant dependence between residuals separated by the indicated lag, which would violate the assumption of independence made when fitting the regression model.

Pane Options
Plot: the type of residuals to plot:

1. Residuals - the residuals from the least squares fit.

2. Studentized residuals - the difference between the observed values $y_i$ and the predicted values $\hat{y}_i$ when the model is fit using all observations except the i-th, divided by their estimated standard error. These residuals are sometimes called externally deleted residuals, since they measure how far each value is from the fitted model when that model is fit using all of the data except the point being considered. This is important, since a large outlier might otherwise affect the model so much that it would not appear to be unusually far away from the line.

Type: the type of plot to be created. A Scatterplot is used to test for curvature. A Normal Probability Plot is used to determine whether the model residuals come from a normal distribution. An Autocorrelation Function is used to test for dependence between consecutive residuals.

Plot Versus: for a Scatterplot, the quantity to plot on the horizontal axis.

Number of Lags: for an Autocorrelation Function, the maximum number of lags. For small data sets, the number of lags plotted may be less than this value.

Confidence Level: for an Autocorrelation Function, the level used to create the probability limits.
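The definition of a Studentized deleted residual can be sketched directly: refit the model without the i-th point, predict that point, and divide the difference by its estimated standard error. The code below is a minimal illustration for a straight-line fit only, not the procedure's own implementation; the function names and the small data set are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x.

    Returns (a, b, sse, xbar, sxx)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = ybar - b * xbar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse, xbar, sxx

def studentized_deleted_residuals(xs, ys):
    """Refit without each point in turn and standardize its prediction error."""
    n = len(xs)
    result = []
    for i in range(n):
        # Fit the line using every observation except the i-th.
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        a, b, sse, xbar, sxx = fit_line(x_rest, y_rest)
        m = n - 1                      # observations used in the reduced fit
        s2 = sse / (m - 2)             # residual variance of the reduced fit
        deleted = ys[i] - (a + b * xs[i])
        # Standard error for predicting a new observation at xs[i].
        se = math.sqrt(s2 * (1 + 1 / m + (xs[i] - xbar) ** 2 / sxx))
        result.append(deleted / se)
    return result

# Hypothetical data: roughly y = 2x, with one outlier in row 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 7.8, 15.0, 12.1, 13.9, 16.2]

for row, t in enumerate(studentized_deleted_residuals(xs, ys), start=1):
    flag = "  <- unusual" if abs(t) >= 2.0 else ""
    print(f"row {row}: {t:8.3f}{flag}")
```

Because the outlier is excluded from the fit used to judge it, it cannot pull the line toward itself, so its Studentized residual is far larger than an ordinary standardized residual would be.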
Unusual Residuals

Once the model has been fit, it is useful to study the residuals to determine whether any outliers exist that should be removed from the data. The Unusual Residuals pane lists all observations that have Studentized residuals of 2.0 or greater in absolute value.

[Table: Unusual Residuals for Heart Rate - columns Row, Y, Predicted Y, Residual, Studentized Residual]

Studentized residuals greater than 3 in absolute value correspond to points more than 3 standard deviations from the fitted model, which is a rare event for a normal distribution. Row #24 is more than 3.3 standard deviations from the fitted model, which is a very rare event if the deviations follow a normal distribution.

Note: Points can be removed from the fit while examining the Scatterplot by clicking on a point and then pressing the Exclude/Include button on the analysis toolbar. Excluded points are marked with an X.
Influential Points

In fitting a regression model, not all observations have equal influence on the parameter estimates in the fitted model. Points located at extreme values of X have greater influence than those located nearer to the center of the experimental region. The Influential Points pane displays any observations that have high influence on the fitted model:

[Table: Influential Points for Heart Rate - columns Row, Leverage, Mahalanobis Distance, DFITS, Cook's Distance, followed by the average leverage of a single data point]

Points are placed on this list for one of the following reasons:

Leverage - measures how distant an observation is from the mean of all n observations in the space of the independent variables. The higher the leverage, the greater the impact of the point on the fitted values $\hat{y}$. Points are placed on the list if their leverage is more than 3 times that of an average data point.

Mahalanobis Distance - measures the distance of a point from the center of the collection of points in the multivariate space of the independent variables. Since this distance is related to leverage, it is not used to select points for the table.

DFITS - measures the difference between the predicted values $\hat{y}_i$ when the model is fit with and without the i-th data point. Points are placed on the list if the absolute value of DFITS exceeds $2\sqrt{p/n}$, where p is the number of coefficients in the fitted model.

Cook's Distance - an overall measure of the influence of the i-th observation on the estimated coefficients. Points are placed on the list if the value is beyond the 50th percentile of an F distribution with p and n - p degrees of freedom.

Because of the perfect balance in this design, all leverage values are equal. However, 9 points made the list because of a large value of DFITS, including all of the points previously identified as large residuals.
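The selection rules above can be illustrated with a short sketch. It assumes a straight-line fit (p = 2) and uses the standard textbook formulas for leverage, DFITS, and Cook's distance, which may differ in detail from the procedure's internal calculations. The function name and data are hypothetical. Cook's distance is computed but not flagged, since the F percentile is not available in the Python standard library.

```python
import math

def influence_measures(xs, ys):
    """Leverage, DFITS and Cook's distance for the fit y = a + b*x."""
    n, p = len(xs), 2
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    a = ybar - b * xbar
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sse = sum(e * e for e in resid)
    mse = sse / (n - p)
    rows = []
    for x, e in zip(xs, resid):
        h = 1 / n + (x - xbar) ** 2 / sxx            # leverage
        # Residual variance with this point deleted (closed form).
        s2_deleted = (sse - e * e / (1 - h)) / (n - p - 1)
        t = e / math.sqrt(s2_deleted * (1 - h))      # Studentized deleted residual
        dfits = t * math.sqrt(h / (1 - h))
        cook = e * e * h / (p * mse * (1 - h) ** 2)
        rows.append({
            "leverage": h,
            "dfits": dfits,
            "cook": cook,
            "high_leverage": h > 3 * p / n,          # 3 times the average leverage p/n
            "high_dfits": abs(dfits) > 2 * math.sqrt(p / n),
        })
    return rows

# Hypothetical data: the last point lies far from the others in X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
ys = [2.2, 3.9, 6.1, 8.3, 9.8, 12.1, 13.8, 16.2, 18.1, 50.0]

for row, m in enumerate(influence_measures(xs, ys), start=1):
    print(row, round(m["leverage"], 3), round(m["dfits"], 3),
          round(m["cook"], 3), m["high_leverage"], m["high_dfits"])
```

The leverages always sum to p, so the average leverage is p/n; in a perfectly balanced design every point has exactly that value and none can be flagged on leverage alone.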
MANOVA

When more than one dependent variable is specified on the data input dialog box, a multivariate analysis of variance may be included if requested using Analysis Options. For example, consider the data from an experiment reported by Johnson and Wichern (2002) performed to determine the optimal conditions for extruding plastic film. Three response variables, Tear resistance, Gloss, and Opacity, were measured at different levels of two factors, Rate of Extrusion and Amount of additive. The data is contained in the file film.sf6:

[Table: columns Rate of Extrusion, Amount of Additive, Tear Resistance, Gloss, Opacity]

The data input dialog box specifies the names of the three response variables and the two factors:
Since the factors are each at only two levels, they can be entered as either categorical factors or quantitative factors. The specified model includes both main effects and a two-factor interaction:
For multiple dependent variables, the Analysis Summary includes separate analyses for each response. If requested on the Analysis Options dialog box, a MANOVA will also be performed. The additional output from that analysis is shown below:

[Table: MANOVA for A, MANOVA for B, and MANOVA for A*B. Each section reports Wilks' lambda, Pillai trace, and Hotelling-Lawley trace, each with an F statistic and P-value, followed by Roy's greatest root and the values s = 1, m = 0.5, n = 6.0. Each section also lists the Hypothesis Matrix H and the Error Matrix E, with rows and columns labeled Tear resistance, Gloss, and Opacity.]

For each effect, the table shows four statistics designed to test whether or not there are significant overall effects due to that factor. The statistics are based on the matrices of sums of
squares and cross-products attributable to the hypothesized effects (H) and to the residuals (E). The statistics displayed are:

Wilks' lambda: a statistic based on the ratio of two determinants

$$\Lambda^* = \frac{|E|}{|E + H|}$$

Pillai Trace: a statistic calculated from

$$\mathrm{tr}\left[H (H + E)^{-1}\right]$$

Hotelling-Lawley Trace: a statistic calculated from

$$\mathrm{tr}\left[H E^{-1}\right]$$

Roy's Greatest Root: a statistic based on the largest eigenvalue of $HE^{-1}$.

The output line for Roy's statistic also displays the values of s, m, and n, three values used to calculate the F tests for the other statistics. It is worth noting that the tests are exact if s = 1 or 2 and approximate otherwise.

The first three statistics are shown together with the results of F-tests. Small P-Values (less than 0.05 if operating at the 5% significance level) indicate significant effects. In the example, the main effects of both
Save Results

The following results may be saved to the datasheet:

1. Predicted Values - the predicted value of Y corresponding to each of the n observations.
2. Standard Errors of Predictions - the standard errors for the n predicted values.
3. Lower Limits for Predictions - the lower prediction limits for each predicted value.
4. Upper Limits for Predictions - the upper prediction limits for each predicted value.
5. Standard Errors of Means - the standard errors for the mean value of Y at each of the n values of X.
6. Lower Limits for Forecast Means - the lower confidence limits for the mean value of Y at each of the n values of X.
7. Upper Limits for Forecast Means - the upper confidence limits for the mean value of Y at each of the n values of X.
8. Residuals - the n residuals.
9. Studentized Residuals - the n Studentized residuals.
10. Leverages - the leverage values corresponding to the n values of X.
11. DFITS Statistics - the value of the DFITS statistic corresponding to the n values of X.
12. Mahalanobis Distances - the Mahalanobis distance corresponding to the n values of X.
13. Cook's Distances - Cook's distance corresponding to the n values of X.
14. Coefficients - the estimated model coefficients.
Calculations

Regression Model

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_{p-1} X_{p-1}$$

Error Sum of Squares

Unweighted:

$$SSE = \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i,1} - \hat{\beta}_2 x_{i,2} - \dots - \hat{\beta}_{p-1} x_{i,p-1} \right)^2$$

Weighted:

$$SSE = \sum_{i=1}^{n} w_i \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i,1} - \hat{\beta}_2 x_{i,2} - \dots - \hat{\beta}_{p-1} x_{i,p-1} \right)^2$$

Coefficient Estimates

$$\hat{\beta} = (X'WX)^{-1} X'WY$$

$$s^2(\hat{\beta}) = MSE\,(X'WX)^{-1}$$

$$MSE = \frac{SSE}{n - p}$$

where $\hat{\beta}$ is a column vector containing the estimated regression coefficients, X is an (n, p) matrix containing a 1 in the first column (if the model contains a constant term) and the settings of the predictor variables in the other columns, Y is a column vector with the values of the dependent variable, and W is an (n, n) diagonal matrix containing the weights $w_i$ on the diagonal for a weighted regression or 1's on the diagonal if weights are not specified. A modified sweep algorithm is used to solve the equations after centering and rescaling of the independent variables.
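The coefficient formulas above can be checked numerically with a small sketch. It forms X'WX and X'WY explicitly and inverts X'WX by plain Gauss-Jordan elimination, which is a stand-in chosen for brevity and not the modified sweep algorithm (with centering and rescaling) that the procedure is described as using. All function names and data are hypothetical.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def weighted_least_squares(x_rows, y, w):
    """Return (beta, mse, std_err) for the weighted least squares fit.

    x_rows: n rows of p predictor settings (include a leading 1.0 for a constant).
    y: n responses.  w: n weights (all 1.0 for an unweighted fit)."""
    n, p = len(x_rows), len(x_rows[0])
    xtwx = [[sum(w[i] * x_rows[i][j] * x_rows[i][k] for i in range(n))
             for k in range(p)] for j in range(p)]
    xtwy = [sum(w[i] * x_rows[i][j] * y[i] for i in range(n)) for j in range(p)]
    inv = invert(xtwx)
    # beta = (X'WX)^-1 X'WY
    beta = [sum(inv[j][k] * xtwy[k] for k in range(p)) for j in range(p)]
    # Weighted error sum of squares and MSE = SSE / (n - p)
    sse = sum(w[i] * (y[i] - sum(b * x for b, x in zip(beta, x_rows[i]))) ** 2
              for i in range(n))
    mse = sse / (n - p)
    # Standard errors from the diagonal of MSE * (X'WX)^-1
    std_err = [math.sqrt(mse * inv[j][j]) for j in range(p)]
    return beta, mse, std_err

# Hypothetical data: constant term plus one predictor, with unequal weights.
x_rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [1.1, 2.9, 5.2, 6.8, 9.1]
w = [1.0, 2.0, 1.0, 2.0, 1.0]

beta, mse, std_err = weighted_least_squares(x_rows, y, w)
print("coefficients:", [round(b, 4) for b in beta])
print("MSE:", round(mse, 4))
print("standard errors:", [round(s, 4) for s in std_err])
```

Setting every weight to 1.0 reduces the calculation to ordinary least squares, matching the unweighted form of SSE.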
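Analysis of Variance

In the tables below, b denotes the vector of estimated coefficients $\hat{\beta}$.

With constant term:

Source | Sum of Squares | Df | Mean Square | F-Ratio
Model | $SSR = b'X'WY - \dfrac{\left(\sum_{i=1}^{n} w_i y_i\right)^2}{\sum_{i=1}^{n} w_i}$ | p-1 | $MSR = \dfrac{SSR}{p-1}$ | $F = \dfrac{MSR}{MSE}$
Residual | $SSE = Y'WY - b'X'WY$ | n-p | $MSE = \dfrac{SSE}{n-p}$ |
Total (corr.) | $SSTO = \sum_{i=1}^{n} w_i (y_i - \bar{y})^2$ | n-1 | |

Without constant term:

Source | Sum of Squares | Df | Mean Square | F-Ratio
Model | $SSR = b'X'WY$ | p | $MSR = \dfrac{SSR}{p}$ | $F = \dfrac{MSR}{MSE}$
Residual | $SSE = Y'WY - b'X'WY$ | n-p | $MSE = \dfrac{SSE}{n-p}$ |
Total | $SSTO = Y'WY$ | n | |

R-Squared

$$R^2 = 100\,\frac{SSR}{SSR + SSE}\,\% \qquad (26)$$

Adjusted R-Squared

$$R^2_{adj} = 100\left[1 - \left(\frac{n-1}{n-p}\right)\frac{SSE}{SSR + SSE}\right]\% \qquad (27)$$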
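As a quick arithmetic check of the F-ratio and of equations (26) and (27), the sketch below evaluates them for made-up sums of squares. The function names and the numbers are hypothetical and serve only to show how the quantities relate.

```python
def r_squared(ssr, sse):
    """R-squared in percent, equation (26)."""
    return 100.0 * ssr / (ssr + sse)

def adjusted_r_squared(ssr, sse, n, p):
    """Adjusted R-squared in percent, equation (27)."""
    return 100.0 * (1.0 - (n - 1) / (n - p) * sse / (ssr + sse))

def f_ratio(ssr, sse, n, p, constant=True):
    """F = MSR / MSE; the model has p-1 degrees of freedom with a constant term, p without."""
    model_df = p - 1 if constant else p
    msr = ssr / model_df
    mse = sse / (n - p)
    return msr / mse

# Hypothetical values: 12 observations, 3 coefficients including the constant.
ssr, sse, n, p = 80.0, 20.0, 12, 3
print(r_squared(ssr, sse))                           # 80.0
print(round(adjusted_r_squared(ssr, sse, n, p), 2))  # 75.56
print(round(f_ratio(ssr, sse, n, p), 2))             # 18.0
```

The adjusted value is lower than the unadjusted one because the factor (n-1)/(n-p) penalizes each additional coefficient in the model.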
Stnd. Error of Est.

$$\hat{\sigma} = \sqrt{MSE} \qquad (28)$$

Residuals

$$e_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i,1} - \dots - \hat{\beta}_{p-1} x_{i,p-1} \qquad (29)$$

Mean Absolute Error

$$MAE = \frac{\sum_{i=1}^{n} w_i \left| e_i \right|}{\sum_{i=1}^{n} w_i} \qquad (30)$$

Durbin-Watson Statistic

$$D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} \qquad (31)$$

If n > 500, then

$$D^* = \frac{D - 2}{\sqrt{4/n}} \qquad (32)$$

is compared to a standard normal distribution. For 100 < n ≤ 500, D/4 is compared to a beta distribution with parameters

$$\alpha = \beta = \frac{n - 1}{2} \qquad (33)$$

For smaller sample sizes, D/4 is compared to a beta distribution with parameters which are based on the trace of certain matrices related to the X matrix, as described by Durbin and Watson (1951) in section 4 of their classic paper.
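Equations (30) through (32) translate almost line for line into code. The sketch below is a minimal illustration with hypothetical function names and residuals; it does not implement the beta-distribution comparisons used for n ≤ 500.

```python
import math

def mean_absolute_error(residuals, weights=None):
    """Weighted mean absolute error, equation (30); weights default to 1."""
    if weights is None:
        weights = [1.0] * len(residuals)
    return sum(w * abs(e) for w, e in zip(weights, residuals)) / sum(weights)

def durbin_watson(residuals):
    """Durbin-Watson statistic D, equation (31)."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

def durbin_watson_z(d, n):
    """Large-sample standardization D* = (D - 2) / sqrt(4 / n), equation (32)."""
    return (d - 2.0) / math.sqrt(4.0 / n)

# Hypothetical residuals that alternate in sign (negative autocorrelation).
e = [1.0, -1.0, 1.0, -1.0]
print(durbin_watson(e))                                         # 3.0
print(mean_absolute_error([1.0, -2.0, 3.0], [1.0, 1.0, 2.0]))   # 2.25
```

Values of D near 2 are consistent with independent residuals; values well below 2 suggest positive autocorrelation between consecutive residuals, and values well above 2 suggest negative autocorrelation.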
More informationJanuary Examinations 2015
24/5 Canddates Only January Examnatons 25 DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR STUDENT CANDIDATE NO.. Department Module Code Module Ttle Exam Duraton (n words)
More information[ ] λ λ λ. Multicollinearity. multicollinearity Ragnar Frisch (1934) perfect exact. collinearity. multicollinearity. exact
Multcollnearty multcollnearty Ragnar Frsch (934 perfect exact collnearty multcollnearty K exact λ λ λ K K x+ x+ + x 0 0.. λ, λ, λk 0 0.. x perfect ntercorrelated λ λ λ x+ x+ + KxK + v 0 0.. v 3 y β + β
More informationSTAT 511 FINAL EXAM NAME Spring 2001
STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte
More informationHere is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)
Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,
More informationA Robust Method for Calculating the Correlation Coefficient
A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal
More informationDurban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications
Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department
More informationDO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes
25/6 Canddates Only January Examnatons 26 Student Number: Desk Number:...... DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR Department Module Code Module Ttle Exam Duraton
More informationChapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.
Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the
More informationResource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis
Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques
More informationThe Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction
ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also
More informationBETWEEN-PARTICIPANTS EXPERIMENTAL DESIGNS
1 BETWEEN-PARTICIPANTS EXPERIMENTAL DESIGNS I. Sngle-factor desgns: the model s: y j = µ + α + ε j = µ + ε j where: y j jth observaton n the sample from the th populaton ( = 1,..., I; j = 1,..., n ) µ
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Experment-I MODULE VIII LECTURE - 34 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS EFFECTS MODEL Dr Shalabh Department of Mathematcs and Statstcs Indan
More informationNUMERICAL DIFFERENTIATION
NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the
More informationANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.
ANSWERS CHAPTER 9 THINK IT OVER thnk t over TIO 9.: χ 2 k = ( f e ) = 0 e Breakng the equaton down: the test statstc for the ch-squared dstrbuton s equal to the sum over all categores of the expected frequency
More informationPolynomial Regression Models
LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomal Regresson Models Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Test of sgnfcance To test the sgnfcance
More informationLectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix
Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could
More informationJoint Statistical Meetings - Biopharmaceutical Section
Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve
More informationLecture 3 Stat102, Spring 2007
Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture
More informationLinear Approximation with Regularization and Moving Least Squares
Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Analyss of Varance and Desgn of Exerments-I MODULE II LECTURE - GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 3.
More informationPHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University
PHYS 45 Sprng semester 7 Lecture : Dealng wth Expermental Uncertantes Ron Refenberger Brck anotechnology Center Purdue Unversty Lecture Introductory Comments Expermental errors (really expermental uncertantes)
More informationTopic 23 - Randomized Complete Block Designs (RCBD)
Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,
More information