On the detection of influential outliers in linear regression analysis


American Journal of Theoretical and Applied Statistics 2014; 3(4)
Published online July 30, 2014
doi: 10.11648/j.ajtas

Arimiyaw Zakaria, Nathaniel Kwamina Howard, Bismark Kwao Nkansah *

Department of Mathematics and Statistics, University of Cape Coast, Cape Coast, Ghana

Email address: bnkansah@ucc.edu.gh (B. K. Nkansah), zzarmyaw@yahoo.com (A. Zakaria), nathoward965@yahoo.co.uk (N. K. Howard)

To cite this article: Arimiyaw Zakaria, Nathaniel Kwamina Howard, Bismark Kwao Nkansah. On the Detection of Influential Outliers in Linear Regression Analysis. American Journal of Theoretical and Applied Statistics. Vol. 3, No. 4, 2014. doi: 10.11648/j.ajtas

Abstract: In this paper, we propose a measure for detecting influential outliers in linear regression analysis. The performance of the proposed method, called the Coefficient of Determination Ratio (CDR), is then compared with some standard measures of influence, namely: Cook's distance, studentised deleted residuals, leverage values, covariance ratio, and difference in fits standardized. Two existing datasets, one artificial and one real, are employed for the comparison and to illustrate the efficiency of the proposed measure. It is observed that the proposed measure appears more responsive to detecting influential outliers in both simple and multiple linear regression analyses. The CDR thus provides a useful alternative to existing methods for detecting outliers in structured datasets.

Keywords: Coefficient of Determination Ratio, Cook's Distance, DFFITS, CVR, Studentised Deleted Residuals, Leverage Values

1. Introduction

An outlier in a set of data is defined to be an observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data [1]. Outliers may represent data that are contaminated in some way (e.g., a recording error, an error in the experimental procedure), or they may represent an accurate observation of a rare case [2]. Because outlying observations affect parameter estimates and inferences about models and their suitability, studying outliers helps to reduce their influence. Outlier identification is done relative to a specified model. If the form of the model is modified, the status of individual observations as outliers may change [3]. Consequently, outliers that remain in a dataset can lead to misleading results.

Regression analysis is one of the most important statistical techniques for model fitting. If a regression model is appropriately selected, most observations should be fairly close to the regression line or hyperplane. The observations which are far away from the regression line or hyperplane may not be ideal observations for the selected model and could potentially be identified as outliers for the model. The least squares method is undoubtedly the most popular parameter estimation technique, mainly due to its computational simplicity and underlying optimal properties [4, 5]. It is well known that inferences based on least squares regression can be strongly influenced by only a few observations in the data, and the fitted model may reflect unusual features of those observations rather than the overall relationship between the variables [6].

Several techniques have been developed for detecting problems with a dataset in regression analysis. They differ in the particular regression result on which the effect of deleting an observation is measured. For instance, Cook's distance measures the effect of observations on the estimated regression coefficients. Other measures of influence capture the effect of observations on the fitted values and on the variance-covariance matrix of the parameter estimates. This paper is another attempt at identifying a more responsive measure for detecting even the more subtle suspect outliers.

In this section, we provide a brief review of the measures that are used for detecting influential observations in structured data. In the next section, we propose an alternative measure for outlier detection. The third section then compares the proposed measure with the standard ones using some datasets.

1.1. Review of Measures of Influence

In this section, we discuss some standard measures of influence. These measures are the leverage value, studentised deleted residuals, Cook's distance, DFFITS, and the Covariance Ratio.

1.1.1. Leverage Values

Leverage values are employed to identify outliers with respect to their x values. The leverage value measures the distance between the observation's x values and the centre of the data. If the leverage value of an observation is large, the observation is outlying with respect to its x values. The diagonal elements of the hat matrix (called leverage values) are therefore a useful indicator of whether or not an observation is outlying with respect to its x values. The leverage value, $h_i$, for the ith observation in the data matrix $X$ is given by

$$h_i = x_i'(X'X)^{-1}x_i, \quad i = 1, 2, \ldots, n \tag{1}$$

where $x_i'$ is the ith row of $X$. If the ith observation is outlying in terms of its x values and therefore has a large value of $h_i$, it exercises substantial weight (leverage) in determining the fitted value $\hat{y}_i$. A leverage value $h_i$ is usually considered to be large if it is more than twice as large as the mean leverage value. That is, leverage values greater than $2(k+1)/n$ are considered by this rule to indicate outlying observations with regard to their x values.

1.1.2. Studentised Deleted Residuals

The studentised deleted residual, $t_i$, is given by

$$t_i = \hat{\varepsilon}_i\left[\frac{n-k-2}{SSE\,(1-h_i) - \hat{\varepsilon}_i^{2}}\right]^{1/2} \tag{2}$$

which is calculated from the residuals $\hat{\varepsilon}_i$, the error sum of squares $SSE$, and the hat matrix values $h_i$, all for the fitted regression based on the n observations. We identify as outlying those observations whose studentised deleted residuals are large in absolute value. In addition, we can conduct a formal test, by means of the Bonferroni procedure, of whether the observation with the largest absolute studentised deleted residual is an outlier. If the regression model is appropriate, so that no observation is outlying because of a change in the model, then each studentised deleted residual will follow the t distribution with $(n-k-2)$ degrees of freedom. The appropriate Bonferroni critical value is therefore $t_{\alpha/(2n);\,n-k-2}$. An observation is considered to be an outlier with respect to its y value if $|t_i| > t_{\alpha/(2n);\,n-k-2}$.

1.1.3. Cook's Distance

Cook's distance measures the squared distance between the least squares estimate $\hat{\beta}$ based on all n observations and the estimate $\hat{\beta}_{(i)}$ obtained when the ith observation is removed. Cook's distance is an aggregate influence measure, showing the effect of the ith observation on all n fitted values. It is given by

$$D_i = \frac{(\hat{\beta}_{(i)} - \hat{\beta})'\,X'X\,(\hat{\beta}_{(i)} - \hat{\beta})}{(k+1)\,s^{2}} \tag{3}$$

or by a more computationally convenient form as

$$D_i = \frac{r_i^{2}}{k+1}\cdot\frac{h_i}{1-h_i} \tag{4}$$

where $r_i^{2}$ is the squared studentised residual, with $r_i = \hat{\varepsilon}_i / (s\sqrt{1-h_i})$, which reflects how well the model fits the ith observation, $y_i$. For interpreting Cook's distance, a rule of thumb is that $D_i > 4/(n-(k+1))$ indicates that the observation is influential.

1.1.4. DFFITS

A useful measure of the influence that observation i has on the fitted value $\hat{y}_i$ is given by

$$(DFFITS)_i = \frac{\hat{y}_i - \hat{y}_{i(i)}}{s_{(i)}\sqrt{h_i}} \tag{5}$$

The letters DF denote the difference between the fitted value $\hat{y}_i$ for the ith observation when all n observations are used in fitting the regression function and the corresponding predicted value $\hat{y}_{i(i)}$ obtained when the ith observation is omitted in fitting the regression function. The denominator of Equation (5) is the estimated standard deviation of $\hat{y}_i$, but it uses the standard error $s_{(i)}$, obtained when the ith observation is omitted in fitting the regression function, for estimating the error variance $\sigma^{2}$. The denominator provides standardization, so that the value $(DFFITS)_i$ represents the number of estimated standard deviations of $\hat{y}_i$ by which the fitted value increases or decreases with the inclusion of the ith observation in fitting the regression model. It can be shown that the DFFITS values can be computed by using only the results from the entire dataset, as follows:

$$(DFFITS)_i = t_i\left(\frac{h_i}{1-h_i}\right)^{1/2} = \hat{\varepsilon}_i\left[\frac{n-k-2}{SSE\,(1-h_i) - \hat{\varepsilon}_i^{2}}\right]^{1/2}\left(\frac{h_i}{1-h_i}\right)^{1/2} \tag{6}$$

As a guide for identifying influential observations, it is suggested to consider an observation as influential if the absolute value of DFFITS exceeds 1 for small to medium datasets and $2\sqrt{(k+1)/n}$ for large datasets.
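As an illustration of Equations (1), (2), (4) and (6), the sketch below computes the leverage, studentised deleted residual, Cook's distance and DFFITS for every case from a single fit. It is a minimal sketch under stated assumptions, not the authors' implementation: it covers only simple linear regression with an intercept (k = 1), uses only the Python standard library, and the function names `fit_simple` and `influence_measures` are hypothetical.

```python
import math


def fit_simple(x, y):
    """Least squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def influence_measures(x, y):
    """Per-case h, t, D and DFFITS for one predictor (k = 1), from one fit."""
    n, k = len(x), 1  # k = 1 is an assumption of this sketch
    b0, b1 = fit_simple(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    s2 = sse / (n - k - 1)  # full-data error variance estimate
    rows = []
    for xi, e in zip(x, resid):
        # Equation (1) specialised to an intercept plus one predictor
        h = 1.0 / n + (xi - xbar) ** 2 / sxx
        # Equation (2): studentised deleted residual
        t = e * math.sqrt((n - k - 2) / (sse * (1.0 - h) - e * e))
        # Equation (4): Cook's distance from the squared studentised residual
        r_sq = e * e / (s2 * (1.0 - h))
        d = r_sq * h / ((k + 1) * (1.0 - h))
        # Equation (6): DFFITS from t and h, without refitting
        dffits = t * math.sqrt(h / (1.0 - h))
        rows.append({"h": h, "t": t, "D": d, "DFFITS": dffits})
    return rows
```

Calling `influence_measures(x, y)` on two equal-length lists returns one dictionary per observation, which can then be compared against the rule-of-thumb cut-offs quoted in this section. The formula for t breaks down if deleting a case leaves a perfect fit, a situation this sketch does not guard against.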
1.1.5. Covariance Ratio

One can assess the influence of the ith observation by comparing the estimated variance of $\hat{\beta}$ and the estimated variance of $\hat{\beta}_{(i)}$. Mathematically, the Covariance Ratio (CVR) is given by

$$CVR_i = \left(\frac{\hat{\sigma}_{(i)}^{2}}{\hat{\sigma}^{2}}\right)^{k+1}\frac{1}{1-h_i} \tag{7}$$

Ideally, when all observations have equal influence on the covariance matrix, $CVR_i$ is approximately equal to one. Deviation from unity indicates that the ith observation is potentially influential. A rough calibration point for Equation (7) is $|CVR_i - 1| > 3(k+1)/n$.

2. The Coefficient of Determination Ratio

The general procedure for assessing the influence of an observation in a regression analysis is to determine the changes that occur when that observation is omitted. Several measures of influence have been developed using this concept. We now propose a measure of influence that is based on the value of the coefficient of determination ($R^{2}$) of the linear regression model. To formulate the proposed measure, we first fit a linear regression model to the full data and determine the $R^{2}$ value. Secondly, we compute $R^{2}_{(i)}$, the coefficient of determination when the ith observation is deleted from the dataset. We then compare the values of $R^{2}$ and $R^{2}_{(i)}$ by taking their ratio. This measure is what we refer to in this paper as the Coefficient of Determination Ratio (CDR). The CDR for the ith observation is defined as

$$CDR_i = \frac{R^{2}_{(i)}}{R^{2}}, \quad i = 1, 2, \ldots, n \tag{8}$$

It has been shown (see Appendix C, and [7]) that a suitable expression for $R^{2}_{(i)}$ is

$$R^{2}_{(i)} = \frac{\hat{\beta}'X'X\hat{\beta} - y_i^{2} + \dfrac{\hat{\varepsilon}_i^{2}}{1-h_i}}{y_{(i)}'y_{(i)}}$$

and that $y_{(i)}'y_{(i)} = y'y - y_i^{2}$. Substituting these into Equation (8), with $\hat{\beta}'X'X\hat{\beta} = R^{2}\,y'y$, some further algebraic steps give

$$CDR_i = \frac{1 - \dfrac{1}{R^{2}\,y'y}\left(y_i^{2} - \dfrac{\hat{\varepsilon}_i^{2}}{1-h_i}\right)}{1 - \dfrac{y_i^{2}}{y'y}} \tag{9}$$

(See Appendix C for proof). In Equation (9), the quantity $y_i^{2}/y'y$ is the proportion of total variation contributed by $y_i$; $(1 - y_i^{2}/y'y)$ is the proportion of variation in the dataset that excludes $y_i$; and $y_i^{2} - \hat{\varepsilon}_i^{2}/(1-h_i)$ is the amount of explained variation due to $y_i$.

In computing $CDR_i$ for each observation in a given dataset, there is no need to actually delete observations one after the other and refit the linear regression model each time. A linear regression analysis is carried out only once, and the regression results are then used to evaluate $CDR_i$ for each observation.

As a rule of thumb, if the CDR for the ith observation deviates from unity, then the ith observation is influential. This rule is somewhat general; a method is still needed to determine exact cutoff values for the CDR. In this paper, we examine all CDR values graphically. (The use of a cutoff rule for the CDR is under study.) An index plot of CDR may be a useful graphical device for visualizing suspect outliers. When the CDR values are all about the same, no suspect outlying observations are present. On the other hand, if there are observations with CDR values that stand out from the rest, these observations can be identified as outliers.

3. Implementation of CDR

3.1. Using CDR to Detect Outliers in Simple Linear Regression Analysis

In this section, we illustrate the use of the proposed measure (CDR) to detect outliers in simple linear regression analysis. The results obtained by CDR are compared with those from the known influence measures reviewed in Section 1. The dataset used is an artificial one created by [8] to illustrate the features of a mathematical package for unmasking regression outliers.

We look for outlying observations by considering the observations that do not follow the main pattern of the bulk of the data. Even though this procedure is an informal way of detecting outliers, it is used as a preliminary tool to identify suspect observations. A scatter plot for the data is shown in Figure 1.

Figure 1. Scatter plot of Artificial Data
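The statement that the CDR can be evaluated from a single fit, without deleting observations one at a time, can be checked numerically. The sketch below is a hedged illustration rather than the authors' code. It assumes that CDR is the ratio of the deleted-case R² to the full-data R², that R² is taken in the uncentred form used in Appendix C (sum of squared fitted values over sum of squared responses), and that the model is a simple linear regression with an intercept. The function names are hypothetical.

```python
def fit_simple(x, y):
    """Least squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def uncentered_r2(x, y):
    """R^2 as (sum of squared fitted values) / (sum of squared responses)."""
    b0, b1 = fit_simple(x, y)
    return sum((b0 + b1 * xi) ** 2 for xi in x) / sum(yi * yi for yi in y)


def cdr_single_fit(x, y):
    """CDR for every case using only full-data quantities (no refitting)."""
    n = len(x)
    b0, b1 = fit_simple(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    yty = sum(yi * yi for yi in y)               # y'y
    ssr = sum((b0 + b1 * xi) ** 2 for xi in x)   # equals R^2 * y'y
    r2 = ssr / yty
    ratios = []
    for xi, yi in zip(x, y):
        e = yi - (b0 + b1 * xi)                  # residual of case i
        h = 1.0 / n + (xi - xbar) ** 2 / sxx     # leverage of case i
        # Deleted-case R^2 from the closed-form expressions of Appendix C
        r2_deleted = (ssr - yi * yi + e * e / (1.0 - h)) / (yty - yi * yi)
        ratios.append(r2_deleted / r2)
    return ratios


def cdr_by_refit(x, y):
    """Brute-force CDR: delete each case in turn and refit."""
    x, y = list(x), list(y)
    r2 = uncentered_r2(x, y)
    return [
        uncentered_r2(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) / r2
        for i in range(len(x))
    ]
```

On any dataset the two functions should agree up to floating-point error. Listing the values of `cdr_single_fit` against the case index gives the numbers behind an index plot of the kind described above.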

From Figure 1, the majority of the observations follow a linear pattern. Five observations {28, 29, 30, 31, 32} lie separately from the bulk of the data. These observations are suspected to be outliers. Observation 28 is outlying with respect to its x value. This observation is not influential because it lies along the pattern of the bulk of the data. It can be seen from Figure 1 that observation 29 is outlying with respect to its y value, and therefore may be influential. Further, observations {30, 31, 32} are outliers with respect to both their x and y values. These observations are also influential. A simple regression analysis of the artificial data yields a regression model summary which is presented in Table 1.

Table 1. Regression Model Summary for Artificial Data (columns: Model, R sq, Adj. R sq, Std Error of Estimate)

From Table 1, it is observed that as low as 0.3% of the variation in the response variable Y is accounted for by the predictor variable X. We now examine the performances of the measures of influence for this dataset. The results are shown in Table 2.

Table 2. Influence Measures for Artificial Data (columns: CDR, t, h, D, DF, CVR)

From Table 2, the CDR measure detects observations {28, 29, 30, 31, 32} as outliers (by means of an index plot, in Appendix A, of the values of CDR). These observations have CDR values that markedly deviate from unity. Also, each of D and DFFITS detects observations {28, 29, 30, 31, 32} as outliers. The values of $D_i$ for these observations exceed the cut-off value $4/(n-(k+1)) = 4/(32-2) = 0.133$. These observations also have absolute values of DFFITS that exceed the calibration point of 0.5. The t measure suggests that observations 28 and 29 are outliers since their absolute t values exceed the cutoff point $t_{0.025,\,29} = 2.045$. In addition, h identifies observations 28, 30, 31, and 32 as outlying, but not 29. Their leverage values are greater than twice the average of all leverage values (0.125). Finally, it can be seen that the CVR, just like t, classifies only observations 28 and 29 as suspect outliers. The values of CVR for observations 28 and 29 do not fall within the cutoff interval (0.813, 1.188).

It is worth noting that the CDR, besides D and DFFITS, is successful in detecting all the outliers in the data. However, h identifies all but one outlier. Each of t and CVR detects only two of the five outliers.

Next, we consider an assessment of the influence of the outlying sets of observations on the value of $R^{2}$. The results are presented in Table 3. The table gives the change in $R^{2}$ when the specified observations have been deleted from the data for each of the measures.

Table 3. Effect of Deletion of Outlying Observations on $R^{2}$ Value

Measure    Outlying Observations    New $R^{2}$    Change in $R^{2}$
t          28, 29
h          28, 30, 31, 32
CVR        28, 29
D          28, 29, 30, 31, 32
DF         28, 29, 30, 31, 32
CDR        28, 29, 30, 31, 32

Table 3 indicates that the omission of the observations {28, 29, 30, 31, 32}, the set identified by the CDR, results in a substantial increase in the value of $R^{2}$ from 0.03 to 0.975. The result is the same for D and DFFITS (denoted DF in the table).

3.2. Using CDR to Detect Outliers in Multiple Linear Regression Analysis

For illustrative purposes, we use the data by Moore [9] (as cited in [10]). This data has also been used by [10] to compare the performance of various influence measures in detecting influential observations, high leverage points, and outliers in linear regression. The measured variables are:

Y = log (oxygen demand in dairy waste), mg/min;
X1 = biological oxygen demand, mg/litre;
X2 = total Kjeldahl nitrogen, mg/litre;
X3 = total solids, mg/litre;
X4 = total volatile solids (a component of X3), mg/litre;
X5 = chemical oxygen demand, mg/litre.

Correlation analysis of the data shows the presence of some linear relationship between Y and each of the predictor variables, except X2. Further, there is a somewhat strong positive linear relationship between any pair of the variables X1, X3, X4, and X5. These linear associations among the predictor variables are likely to pose multicollinearity problems. The summary of the fitted regression of Y on the five predictors is shown in Table 4.

Table 4. Regression Model Summary for Moore's Data (columns: Model, R sq, Adj. R sq, Std Error of Estimate)

Table 5. Influence Measures for Moore's Data (columns: CDR, t, h, D, DF, CVR)

From Table 4, we note that even though the $R^{2}$ value is high, it may not be a true representation of the explanatory power of the fitted regression model. In an ideal situation, each observation in the dataset contributes equally to the formation of the value of $R^{2}$. It is, therefore, legitimate to assess each observation for its influence on the $R^{2}$ value in order to identify unusual ones. Table 5 shows the values of the various measures of influence for Moore's data when the ith observation is omitted from the dataset.

First, we consider the coefficient of determination ratio (CDR) in Table 5. It can be observed that the CDR values for all observations are approximately equal to one except for observations {1, 17, 15, 20}. The removal of these observations from the dataset is expected to substantially improve the $R^{2}$. Therefore, the observations {1, 17, 15, 20} are influential.

To evaluate the studentised deleted residual $t_i$ for an observation, we compare this quantity with $t_{\alpha}$ based on $(n-k-2)$ degrees of freedom. Specifically, if $t_i$ is greater in absolute value than $t_{\alpha,\,n-k-2}$, then there is some evidence that the observation is an outlier with respect to its y value. From Table 5, we see that the $t_i$ for observations 1 and 20 (3.584 and .05, respectively) are both greater than $t_{0.05,\,13} = 1.771$. Therefore, we should be very concerned that observations {1, 20} are outliers with respect to their y values.

For Moore's data, there are n = 20 observations, and since the fitted linear regression model utilizes k = 5 independent variables, twice the average leverage value is $2(5+1)/20 = 0.60$. From Table 5, we see that the leverage value for observation 17 is greater than 0.60, which suggests that observation 17 is an outlier with respect to its x values.

From Table 5, Cook's distance for each of observations {1, 17, 20} is greater than the cut-off value $4/(n-(k+1)) = 4/14 \approx 0.286$. This means that removing the group of observations {1, 17, 20} from the dataset would substantially change the least squares estimate of the regression parameters. Hence, observations 1, 17, and 20 are flagged as influential. It can be seen in Table 5 that the DFFITS values for observations 1, 17, and 20 exceed the cut-off value of 1.095, and therefore they should be identified as outliers.

For Moore's data, the cutoff interval for CVR is (0.100, 1.900). The cut-off interval is rather conservative in that it declares too many observations as outliers. In view of this, we use the index plot (see Appendix B) of CVR to find outliers. Observations 1 and 17 are considered as outliers. The CVR value for observation 17 is the largest, indicating that its presence would have the greatest impact on increasing the precision of the parameter estimates. The CVR for observation 1 is the lowest. This shows that the presence of observation 1 in the dataset greatly decreases the precision of the estimates.

The influence of the different sets of suspect outlying observations, from the various measures of influence, on the value of $R^{2}$ is displayed in Table 6.

Table 6. Effect of Deletion of Outlying Observations on $R^{2}$ Value

Measure    Outlying Observations    New $R^{2}$    Change in $R^{2}$
t          1, 20
h          17
CVR        1, 17
D          1, 17, 20
DF         1, 17, 20
CDR        1, 17, 15, 20

It is observed from Table 6 that the $R^{2}$ value increases as a result of deleting the various sets of suspect outlying observations. However, the magnitude of the increment differs across the sets of outlying observations. The greatest change in $R^{2}$ value (from 0.8 to 0.938) is associated with the omission of the outlying set {1, 17, 15, 20}, which is based on the CDR measure of influence. The least change in $R^{2}$ value is linked with the omission of {17}, which is based on the h measure of influence.

A comparison of the values of $R^{2}$ in Table 6 shows that the set of observations {1, 17, 15, 20} detected by CDR is more influential than the sets identified by the other measures of influence. The deletion of this set from the dataset would subsequently lead to a substantial change in the regression estimates. The result shows that the CDR is more responsive in identifying even the subtle outliers.
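The cut-off values quoted in Sections 3.1 and 3.2 follow from the rules of thumb in Section 1.1 once n and k are fixed. The short sketch below reproduces that arithmetic. The function name `rule_of_thumb_cutoffs` is hypothetical, the DFFITS entry uses the large-dataset form that both sections appear to apply, and n = 32 with k = 1 for the artificial data is an assumption inferred from the 29 degrees of freedom quoted for the t cutoff. The critical values of the t distribution are not computed here.

```python
import math


def rule_of_thumb_cutoffs(n, k):
    """Cut-offs for n observations and k predictors (p = k + 1 parameters)."""
    p = k + 1
    return {
        "leverage": 2 * p / n,                  # twice the mean leverage
        "cooks_d": 4 / (n - p),                 # 4 / (n - (k + 1))
        "dffits": 2 * math.sqrt(p / n),         # large-dataset DFFITS rule
        "cvr_interval": (1 - 3 * p / n, 1 + 3 * p / n),  # 1 +/- 3(k + 1)/n
    }


# Artificial data (assumed n = 32, k = 1) and Moore's data (n = 20, k = 5)
artificial = rule_of_thumb_cutoffs(32, 1)
moore = rule_of_thumb_cutoffs(20, 5)
```

For example, `moore["leverage"]` evaluates the same expression, 2(5 + 1)/20, that gives the leverage cut-off of 0.60 used for Moore's data.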

4. Conclusion and Recommendation

The main objective of the paper was to assess the standard measures for detecting influential outliers in structured data. The result shows that the new measure, called the Coefficient of Determination Ratio (CDR), identifies the suspect outliers which all the other measures detect. In addition, it is able to detect other, more subtle suspect outlying observations which the other measures do not detect. This means that by using the CDR, any observation which is potentially an outlier can be identified for assessment. The benefit of using this method is that the significance of the model eventually obtained for summarising the dataset is more reliable. The results also show that the CDR, D, and DF detect almost the same sets of influential outliers. However, the CDR consistently performs differently from the other influence measures such as t, h, and CVR. The implementation of the new method relied mostly on the plot of the values of the CDR. As with the other measures, it would be more formal to identify suspect outliers using exact cut-off values. Future studies in this area should focus on obtaining a generalized cut-off value for the new measure.

Appendix A. Index Plots for Artificial Data [figure: panels (a) to (f)]

Appendix B. Index Plots for Moore's Data [figure: panels (a) to (f)]

Appendix C. Proof of Coefficient of Determination Ratio

The CDR for the ith observation is defined as

$$CDR_i = \frac{R^{2}_{(i)}}{R^{2}}, \quad i = 1, 2, \ldots, n \tag{1}$$

or

$$CDR_i = \frac{SSR_{(i)}/SST_{(i)}}{R^{2}} \tag{2}$$

where $SSR_{(i)}$ is the sum of squares due to regression with the ith observation deleted, and $SST_{(i)}$ is the corresponding total sum of squares. We express $SSR_{(i)}$ and $SST_{(i)}$ in terms of quantities from the full fit. Now

$$SSR_{(i)} = \hat{\beta}_{(i)}'X_{(i)}'X_{(i)}\hat{\beta}_{(i)}.$$

It can be shown ([7]) that $X_{(i)}'X_{(i)} = X'X - x_i x_i'$ and

$$\hat{\beta}_{(i)} = \hat{\beta} - \frac{\hat{\varepsilon}_i}{1-h_i}(X'X)^{-1}x_i. \tag{3}$$

Substituting, we have

$$SSR_{(i)} = \hat{\beta}_{(i)}'(X'X - x_i x_i')\hat{\beta}_{(i)} = \hat{\beta}_{(i)}'X'X\hat{\beta}_{(i)} - (x_i'\hat{\beta}_{(i)})^{2}.$$

Further substitutions using Eq. (3), together with $x_i'\hat{\beta} = \hat{y}_i$ and $h_i = x_i'(X'X)^{-1}x_i$, give

$$\hat{\beta}_{(i)}'X'X\hat{\beta}_{(i)} = \hat{\beta}'X'X\hat{\beta} - \frac{2\hat{\varepsilon}_i\hat{y}_i}{1-h_i} + \frac{h_i\hat{\varepsilon}_i^{2}}{(1-h_i)^{2}}, \qquad x_i'\hat{\beta}_{(i)} = \hat{y}_i - \frac{h_i\hat{\varepsilon}_i}{1-h_i}.$$

Expanding gives

$$SSR_{(i)} = \hat{\beta}'X'X\hat{\beta} - \hat{y}_i^{2} - 2\hat{\varepsilon}_i\hat{y}_i + \frac{h_i\hat{\varepsilon}_i^{2}}{1-h_i}.$$

Substituting $\hat{y}_i = y_i - \hat{\varepsilon}_i$, further simplification gives

$$SSR_{(i)} = \hat{\beta}'X'X\hat{\beta} - y_i^{2} + \frac{\hat{\varepsilon}_i^{2}}{1-h_i}. \tag{4}$$

Now,

$$SST_{(i)} = y_{(i)}'y_{(i)} = y'y - y_i^{2}. \tag{5}$$

Substituting Eq. (4) and Eq. (5) into Eq. (2) yields

$$CDR_i = \frac{\hat{\beta}'X'X\hat{\beta} - y_i^{2} + \dfrac{\hat{\varepsilon}_i^{2}}{1-h_i}}{R^{2}\,(y'y - y_i^{2})}.$$

A few further steps, using $\hat{\beta}'X'X\hat{\beta} = R^{2}\,y'y$, give

$$CDR_i = \frac{1 - \dfrac{1}{R^{2}\,y'y}\left(y_i^{2} - \dfrac{\hat{\varepsilon}_i^{2}}{1-h_i}\right)}{1 - \dfrac{y_i^{2}}{y'y}}.$$

References

[1] Barnett, V., & Lewis, T. (1994). Outliers in statistical data (3rd ed.). New York, NY: John Wiley and Sons.
[2] Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). London, England: Lawrence Erlbaum Associates.
[3] Weisberg, S. (2005). Applied linear regression (3rd ed.). New York, NY: John Wiley and Sons.
[4] Nurunnabi, A. A. M., Imon, A. H. M. R., Ali, A. B. M. S., & Nasser, M. (0). Outlier detection in linear regression. Retrieved June 9, 0 from
[5] Chatterjee, S., & Hadi, A. S. (1988). Sensitivity analysis in linear regression. New York, NY: John Wiley & Sons.
[6] Cook, R. D., & Weisberg, S. (1982). Residuals and influence in regression. New York, NY: Chapman and Hall.
[7] Rencher, A. C., & Schaalje, G. B. (2008). Linear models in statistics (2nd ed.). New Jersey, NJ: John Wiley & Sons.
[8] Siniksaran, E., & Satman, M. H. (2011). PURO: A package for unmasking regression outliers. Gazi University Journal of Science, 24(1).
[9] Moore, J. (1975). Total biochemical oxygen demand of dairy manures. Ph.D. Thesis, Univ. of Minnesota, Dept. Agricultural Engineering.
[10] Chatterjee, S., & Hadi, A. S. (1986). Influential observations, high leverage points, and outliers in linear regression. Statistical Science, 1(3).

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson

More information

Statistics for Business and Economics

Statistics for Business and Economics Statstcs for Busness and Economcs Chapter 11 Smple Regresson Copyrght 010 Pearson Educaton, Inc. Publshng as Prentce Hall Ch. 11-1 11.1 Overvew of Lnear Models n An equaton can be ft to show the best lnear

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Learning Objectives for Chapter 11

Learning Objectives for Chapter 11 Chapter : Lnear Regresson and Correlaton Methods Hldebrand, Ott and Gray Basc Statstcal Ideas for Managers Second Edton Learnng Objectves for Chapter Usng the scatterplot n regresson analyss Usng the method

More information

Chap 10: Diagnostics, p384

Chap 10: Diagnostics, p384 Chap 10: Dagnostcs, p384 Multcollnearty 10.5 p406 Defnton Multcollnearty exsts when two or more ndependent varables used n regresson are moderately or hghly correlated. - when multcollnearty exsts, regresson

More information

Statistics MINITAB - Lab 2

Statistics MINITAB - Lab 2 Statstcs 20080 MINITAB - Lab 2 1. Smple Lnear Regresson In smple lnear regresson we attempt to model a lnear relatonshp between two varables wth a straght lne and make statstcal nferences concernng that

More information

is the calculated value of the dependent variable at point i. The best parameters have values that minimize the squares of the errors

is the calculated value of the dependent variable at point i. The best parameters have values that minimize the squares of the errors Multple Lnear and Polynomal Regresson wth Statstcal Analyss Gven a set of data of measured (or observed) values of a dependent varable: y versus n ndependent varables x 1, x, x n, multple lnear regresson

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε Chapter 3 Secton 3.1 Model Assumptons: Multple Regresson Model Predcton Equaton Std. Devaton of Error Correlaton Matrx Smple Lnear Regresson: 1.) Lnearty.) Constant Varance 3.) Independent Errors 4.) Normalty

More information

Outlier Detection in Logistic Regression: A Quest for Reliable Knowledge from Predictive Modeling and Classification

Outlier Detection in Logistic Regression: A Quest for Reliable Knowledge from Predictive Modeling and Classification Outler Detecton n Logstc egresson: A Quest for elable Knowledge from Predctve Modelng and Classfcaton Abdul Nurunnab, Geoff West Department of Spatal Scences, Curtn Unversty, Perth, Australa CC for Spatal

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

SIMPLE LINEAR REGRESSION

SIMPLE LINEAR REGRESSION Smple Lnear Regresson and Correlaton Introducton Prevousl, our attenton has been focused on one varable whch we desgnated b x. Frequentl, t s desrable to learn somethng about the relatonshp between two

More information

Influence Diagnostics on Competing Risks Using Cox s Model with Censored Data. Jalan Gombak, 53100, Kuala Lumpur, Malaysia.

Influence Diagnostics on Competing Risks Using Cox s Model with Censored Data. Jalan Gombak, 53100, Kuala Lumpur, Malaysia. Proceedngs of the 8th WSEAS Internatonal Conference on APPLIED MAHEMAICS, enerfe, Span, December 16-18, 5 (pp14-138) Influence Dagnostcs on Competng Rsks Usng Cox s Model wth Censored Data F. A. M. Elfak

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

Chapter 15 - Multiple Regression

Chapter 15 - Multiple Regression Chapter - Multple Regresson Chapter - Multple Regresson Multple Regresson Model The equaton that descrbes how the dependent varable y s related to the ndependent varables x, x,... x p and an error term

More information

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9 Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multicollinearity. Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur. Remedies for multicollinearity: Various techniques have

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Analysis of Variance and Design of Experiment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE. Any scientific experiment is performed

Lecture 6: Introduction to Linear Regression

Ani Manichaikul amancha@jhsph.edu 24 April 27. Linear regression: main idea. Linear regression can be used to study an outcome as a linear function of a predictor. Example: 6

where I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X).

11.4.1 Estimation of Multiple Regression Coefficients. In multiple linear regression, we essentially solve n equations for the p unknown parameters. Thus n must be equal to or greater than p and in practice n should be

/ n ) are compared. The logic is: if the two

STAT C141, Spring 2005 Lecture 13 Two sample tests. One sample tests: examples of goodness of fit tests, where we are testing whether our data supports predictions. Two sample tests: called as tests of independence

STAT 3008 Applied Regression Analysis

Tutorial : Simple Linear Regression. LAI Chun He, Department of Statistics, The Chinese University of Hong Kong. 1 Model Assumption. To quantify the relationship between two factors,

Chapter 15 Student Lecture Notes 15-1

Basic Business Statistics (9th Edition) Chapter 15 Multiple Regression Model Building. 004 Prentice-Hall, Inc. Chapter Topics: The Quadratic Regression Model; Using Transformations

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Sandy Eckel seckel@jhsph.edu 6 May 008. Recall: main idea of linear regression. Linear regression can be used to study

Statistics Chapter 4

"There are three kinds of lies: lies, damned lies, and statistics." Benjamin Disraeli, 1895 (British statesman). Gaussian Distribution, 4-1. If a measurement is repeated many times a statistical treatment

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Section 1.5 Correlation. In the previous sections, we looked at regression and the value r was a measurement of how much of the variation in y can be attributed to the linear relationship between y and x. In this section,

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Recall: main idea of linear regression. Sandy Eckel seckel@jhsph.edu 6 May 8. Linear regression can be used to study an

A Robust Method for Calculating the Correlation Coefficient

E.B. Niven and C. V. Deutsch. Relationships between primary and secondary data are frequently quantified using the correlation coefficient; however, the traditional

28. SIMPLE LINEAR REGRESSION III

Fitted Values and Residuals. US Domestic Beers: Calories vs. % Alcohol. To each observed x, there corresponds a y-value on the fitted line, $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$. These are called fitted

This column is a continuation of our previous column

Comparison of Goodness of Fit Statistics for Linear Regression, Part II. The authors continue their discussion of the correlation coefficient in developing a calibration for quantitative analysis. Jerome Workman Jr. and Howard

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

LESSON : CORRECTION ANALYSIS. Reading 9: Correlation and Regression. LOS 9a: Calculate and interpret a sample covariance and a sample correlation

Kernel Methods and SVMs Extension

The purpose of this document is to review material covered in Machine Learning 1 Supervised Learning regarding support vector machines (SVMs). This document also provides a general

Chapter 8 Indicator Variables

In general, the explanatory variables in any regression analysis are assumed to be quantitative in nature. For example, the variables like temperature, distance, age etc. are quantitative in

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

Content: 1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands 3. Cochran's Theorem 4. General Linear Testing 5. Measures of

[The following data appear in Wooldridge Q2.3.] The table below contains the ACT score and college GPA for eight college students.

PPOL 59-3 Problem Set: Exercises in Simple Regression. Due in class /8/7. In this problem set, you are asked to compute various statistics by hand to give you a better sense of the mechanics of the Pearson correlation

Statistical Evaluation of WATFLOOD

By: Angela MacLean, Dept. of Civil & Environmental Engineering, University of Waterloo, Ont. October, 005. The statistics program associated with WATFLOOD uses spl.csv file that is produced with

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multicollinearity. Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur. 6. Ridge regression: The OLSE is the best linear unbiased

Economics 130. Lecture 4 Simple Linear Regression Continued

Readings for Week 4: Text, Chapter and 3. We continue with addressing our second issue + add in how we evaluate these relationships: Where do we get data to do this analysis? How do

Chapter 6. Supplemental Text Material

S6-. Factor Effect Estimates are Least Squares Estimates. We have given heuristic or intuitive explanations of how the estimates of the factor effects are obtained in the textbook.

STATISTICS QUESTIONS. Step by Step Solutions.

www.mathcracker.com 9//016. Problem 1: A researcher is interested in the effects of family size on delinquency for a group of offenders and examines families with one to

18. SIMPLE LINEAR REGRESSION III

US Domestic Beers: Calories vs. % Alcohol. Fitted Values and Residuals. To each observed x, there corresponds a y-value on the fitted line, $\hat{y} = \hat{\alpha} + \hat{\beta} x$. These are called fitted values.

On the Influential Points in the Functional Circular Relationship Models

Department of Mathematics, Faculty of Science, Al-Azhar University-Gaza, Gaza, Palestine. alzad33@yahoo.com. Abstract: If the interest is to calibrate

January Examinations 2015

24/5 Candidates Only. January Examinations 25. DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. STUDENT CANDIDATE NO.. Department, Module Code, Module Title, Exam Duration (in words)

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

Department of Statistics, Osmania University, Hyderabad -500 007, India. nanbyrozu@gmal.com, ramu0@gmal.com

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Applied Mathematical Sciences, Vol. 7, 0, no. 47, 07-0. HIKARI Ltd, www.m-hkar.com

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management

Revised August 005. Chapter 14 Formulas. Simple Linear Regression Model: y =

Methods of Detecting Outliers in A Regression Analysis Model.

Ogu, A. I.*, Inyama, S. C+, Achugamonu, P. C++. *Department of Statistics, Imo State University, Owerri. +Department of Mathematics, Federal University of

Unit 10: Simple Linear Regression and Correlation

Statistics 571: Statistical Methods. Ramón V. León, 6/28/2004. Introductory Remarks: Regression analysis is a method for studying the

Chapter 14 Simple Linear Regression

Managerial decisions often are based on the relationship between two or more variables. Regression analysis can be used to develop an equation showing

Introduction to Regression

Dr Tom Ilvento, Department of Food and Resource Economics. Overview: The last part of the course will focus on Regression Analysis. This is one of the more powerful statistical techniques. Provides

Introduction to Generalized Linear Models

INTRODUCTION TO STATISTICAL MODELLING TRINITY 00. I. Motivation: In this lecture we extend the ideas of linear regression to the more general idea of a generalized linear model

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

ECONOMICS 5* -- NOTE (Summary). 1. Introduction: CLRM stands for the Classical Linear Regression Model. The CLRM is also

SIMPLE LINEAR REGRESSION and CORRELATION

Experimental Design and Statistical Methods Workshop. Jesús Piedrafita Arilla, jesus.pedrafta@uab.cat, Departament de Ciència Animal i dels Aliments. Items: Correlation: degree of

An identification algorithm of model kinetic parameters of the interfacial layer growth in fiber composites

IOP Conference Series: Materials Science and Engineering. PAPER, OPEN ACCESS. To cite this article: V Zubov et al 216

A METHOD FOR DETECTING OUTLIERS IN FUZZY REGRESSION

OPERATIONS RESEARCH AND DECISIONS No. 2 21. Barbara GŁADYSZ*. In this article we propose a method for identifying outliers in fuzzy regression. Outliers in a sample

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha. ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa. Department

SOME METHODS OF DETECTION OF OUTLIERS IN LINEAR REGRESSION MODEL

RANJIT KUMAR PAUL, M. Sc. (Agricultural Statistics), Roll No. 4405, IASRI, Library Avenue, New Delhi-11001. Chairperson: Dr. L. M. Bhar. Abstract:

ANOVA. The Observations y ij

ANOVA stands for ANalysis Of VAriance, but it is a test of differences in means. The idea: the observations y ij, by treatment group (i = 1, i = 2, i = k): y 11, y 21, y k,1; y 12, y 22, y k,2; y 1,n1, y 2,n2, y k,nk; means: m 1, m 2

LECTURE 9 CANONICAL CORRELATION ANALYSIS

Introduction: The concept of canonical correlation arises when we want to quantify the associations between two sets of variables. For example, suppose that the first set of

Diagnostics in Poisson Regression. Models - Residual Analysis

Outline: Diagnostics in Poisson Regression Models - Residual Analysis; Example 3: Recall of Stressful Events, continued. Residual Analysis: Residuals represent

Polynomial Regression Models

LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomial Regression Models. Dr. Shalabh, Department of Mathematics and Statistics, Indian Institute of Technology Kanpur. Test of significance: To test the significance

Statistics II Final Exam 26/6/18

Academic Year 2017/18. Solutions. Exam duration: 2 h 30 min. 1. (3 points) A town hall is conducting a study to determine the amount of leftover food produced by the restaurants in the

Basically, if you have a dummy dependent variable you will be estimating a probability.

ECON 497: Lecture Notes 13. Metropolitan State University, ECON 497: Research and Forecasting. Lecture Notes 13: Dummy Dependent Variable Techniques, Studenmund Chapter 13. Basically, if you have a dummy

Testing for outliers in nonlinear longitudinal data models based on M-estimation

ISSN 1746-7659, England, UK. Journal of Information and Computing Science, Vol 1, No, 017, pp107-11. Huhu Sun 1. 1 School of Mathematics and

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University

PHYS 45 Spring semester 7. Lecture Introductory Comments: Experimental errors (really experimental uncertainties)

TECOLOTE RESEARCH, INC.

Bridging Engineering and Economics Since 1973. THE MINIMUM-UNBIASED-PERCENTAGE ERROR (MUPE) METHOD IN CER DEVELOPMENT. Third Joint Annual ISPA/SCEA International Conference

Regression. The Simple Linear Regression Model

Regression: Simple Linear Regression Model; Least Squares Method; Coefficient of Determination; Model Assumptions; Testing for Significance; Using the Estimated Regression Equation for Estimation and Prediction; Residual Analysis: Validating

$H = X(X'X)^{-1}X'$ is called the hat matrix (it puts the hats on the Y's) and is of order $n \times n$.

$\hat{\beta} = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + A\varepsilon$, where $A = (X'X)^{-1}X'$. $E[\hat{\beta}] = \beta + A\,E[\varepsilon] = \beta$, so $\hat{\beta}$ is unbiased. $\mathrm{Cov}(\hat{\beta}) = E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] = E[A\varepsilon\varepsilon'A'] = A\,E[\varepsilon\varepsilon']\,A' = A\sigma^2 I A' = \sigma^2 AA' = \sigma^2 (X'X)^{-1}$. Fitted values

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the

Chapter 11 Student Lecture Notes 11-1. Linear regression. Wenl lu, Dept. Health statistics, School of public health, Tianjin medical university. Regression Models: 1. Answer What Is the Relationship Between the Variables?

Regulation No. 117 (Tyres rolling noise and wet grip adhesion) Proposal for amendments to ECE/TRANS/WP.29/GRB/2010/3

Transmitted by the expert from France. Informal Document No. GRB-51-14 (67th GRB, 15-17 February 2010, agenda item 7). Proposal for amendments to

RESIDUALS AND INFLUENCE IN NONLINEAR REGRESSION FOR REPEATED MEASUREMENT DATA

Operations Research and Applications : An International Journal (ORAJ), Vol.4, No.3/4, November 17. Munsr Al, Yu Feng, Al choo, Zamr

Research Article On the Performance of the Measure for Diagnosing Multiple High Leverage Collinearity-Reducing Observations

Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 212, Article ID 53167, 16 pages, doi:1.1155/212/53167

7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA

Single classification analysis of variance (ANOVA): When to use ANOVA; ANOVA models and partitioning sums of squares; ANOVA: hypothesis testing; ANOVA: assumptions; A non-parametric alternative: Kruskal-Wallis ANOVA; Power

Chapter 12 Analysis of Covariance

Any scientific experiment is performed to know something that is unknown about a group of treatments and to test certain hypothesis about the corresponding treatment effect. When variability

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting.

The Practice of Statistics, nd ed. Chapter 14 Inference for Regression. Introduction: In chapter 3 we used a least-squares regression line (LSRL) to represent a linear relationship between two quantitative explanatory

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Session Outline: Introduction to classification problems and discrete choice models. Introduction to Logistic Regression. Logistic function and Logit function. Maximum Likelihood Estimator (MLE) for estimation of LR parameters.

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

February, Last Name: First Name: Student Number: Instructions: Time: hours. Aids: a non-programmable

Chapter 5 Multilevel Models

5.1 Cross-sectional multilevel models 5.1.1 Two-level models 5.1.2 Multiple level models 5.1.3 Multiple level modeling in other fields 5.2 Longitudinal multilevel models 5.2.1 Two-level

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Steele, M., Chaseling, J. and Hurst, C. School of Mathematical and Physical Sciences, James Cook University; Australian School of Environmental Studies, Griffith

CHAPTER 8. Exercise Solutions

Chapter 8, Exercise Solutions, Principles of Econometrics, 3e. EXERCISE 8.

STAT 511 FINAL EXAM NAME Spring 2001

Instructions: This is a closed book exam. No notes or books are allowed. You may use a calculator but you are not allowed to store notes or formulas in the calculator. Please write

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE

Diarmuid O'Driscoll ¹, Donald E. Ramirez ². ¹ Head of Department of Mathematics and Computer Studies

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ 39 - Statistical Properties of the OLS estimator. Sanjaya DeSilva, September, 008. 1 Overview: Recall that the true regression model is $Y = \beta_0 + \beta_1 X + u$ (1). Applying the OLS method to a sample of data, we estimate

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

Econ 413 Exam 13 H ANSWERS. The set is divided into 9 sub-questions, A, B, C, which are all recommended to count equally to make it a little easier to pass. Answers are given . Unfortunately, there is a printing error in the hint of

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis

SEMESTER I EXAMINATION 014-015, December 014. TIME ALLOWED: HOURS. INSTRUCTIONS TO CANDIDATES: 1. This examination paper contains FOUR (4) questions

Chapter 3. Two-Variable Regression Model: The Problem of Estimation

Ordinary Least Squares Method (OLS). Recall that, PRF: $Y = \beta_1 + \beta_2 X + u$. Thus, since PRF is not directly observable, it is estimated by SRF; that

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes

25/6 Candidates Only. January Examinations 26. Student Number: Desk Number:...... Department, Module Code, Module Title, Exam Duration

Lecture 16 Statistical Analysis in Biomaterials Research (Part II)

3.051J/0.340J. C. F Distribution: Allows comparison of variability of behavior between populations using test of hypothesis: σ x = σ x. Named for British statistician

FREQUENCY DISTRIBUTIONS. The idea of a frequency distribution for sets of observations will be introduced,

I. Introduction 1. The idea of a frequency distribution for sets of observations will be introduced, together with some of the mechanics for constructing distributions of data. Then

Statistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics )

Ismor Fischer, 8//008, Stat 54 / -8. 2.3 Summary Statistics: Measures of Center and Spread. Distribution of discrete / continuous POPULATION Random Variable, numerical. True center = ??? True spread = ???? parameters (population